Train 1-175 billion parameter models up to 8x faster than legacy cloud services, at a fraction of the cost

  • A dedicated cluster with millions of AI cores
  • Performance at the push of a button: the largest models, no complex distribution required
  • Simple setup: just SSH in and go
  • Simple programming: models in standard PyTorch and TensorFlow
  • Predictable fixed price blocks for standard model training, available now

Learn More About the Free Trial

Cerebras AI Model Studio

The Cerebras AI Model Studio is a simple pay-by-the-model computing service powered by dedicated clusters of Cerebras CS-2s and hosted by Cirrascale Cloud Services. It is a purpose-built platform, optimized for training large language models on dedicated clusters of millions of cores. It provides deterministic performance, requires no distributed computing headaches, and is push-button simple to start.

The Problem

Training large Transformer models such as GPT and T5 on traditional cloud platforms with graphics processors is painful, expensive, and time consuming. The largest instance typically offered in the cloud is an 8-way GPU server, and it often takes weeks just to get access. Networking, storage, and compute cost extra, and setup is no joke. Models with tens of billions of parameters take weeks to get going and months to train. If you want to train in less time, you can attempt to reserve additional instances – but unpredictable inter-instance latency makes distributing AI work difficult, and achieving high performance across multiple instances nearly impossible. The result: very few large models are ever trained in a traditional cloud.

Our Solution

The Cerebras AI Model Studio makes training large Transformer models for language and generative AI applications fast, easy, and affordable. With Cerebras, you get millions of cores, predictable performance, and no parallel distribution headaches – all of which lets you quickly and easily run existing models on your data, or build new models from scratch optimized for your business.

A dedicated cloud-based cluster powered by Cerebras CS-2 systems with millions of AI cores for large language models and generative AI:

    • Train 1-175 billion parameter models quickly and easily
    • No parallel distribution pain: single-keystroke scaling over millions of cores
    • Zero DevOps or firewall pain: simply SSH in and go
    • Simple programming: models in standard PyTorch or TensorFlow
    • Flexibility: pre-train or fine-tune models with your data
    • Train in a known amount of time, for a fixed fee

Key Benefits

Large models in less time

  • Train 1-175 billion parameter models 8x faster than the largest publicly available AWS GPU instance
  • Enable higher-performing models with our longer sequence lengths (up to 50,000!)

Simple & Easy to Use

  • Easy access: simply SSH in and go 
  • Simple programming: range of large language models in standard PyTorch and TensorFlow 
  • Push-button performance: the power of millions of AI cores dedicated to your work with no distributed programming required 
  • Even the largest GPT models run without a single minute spent on parallelizing work 

Price

  • Models trained at half the price of AWS 
  • Predictable fixed price cost for production model training 

Flexibility

Train your models from scratch or fine-tune open-source models with your data

Ownership

Dependency-free: keep the trained weights for the models you build

Simple & Secure cloud operations

  • Simple onboarding: no DevOps required 
  • Software environment, libraries, secure storage, networking configured and ready to go 

Paid Access

Train your own state-of-the-art GPT model for your application on your data.

Standard Offering

  • Pick a large model from the list below (or contact us for special projects)
  • See the price, time to train: no surprises
  • SSH in and get going
    • Enjoy secure, dedicated access to a programming environment for the training period
    • A Cerebras implementation of the chosen model is provided
    • Systems, code examples, and documentation are at your fingertips
    • Scripts let you vary training parameters, e.g. batch size, learning rate, training steps, checkpointing frequency
    • Train on the Cerebras-curated Pile dataset if desired
  • Save and export trained weights and training log data from your work to use as you see fit
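The workflow above – SSH in, then adjust training parameters through provided scripts – might look something like this hypothetical sketch in Python. The flag names and defaults here are illustrative assumptions for this page, not the actual Cerebras tooling:

```python
# Hypothetical sketch of a tunable launch script like those described above.
# Flag names and default values are illustrative, not Cerebras's actual CLI.
import argparse

def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(description="Launch a training run (illustrative)")
    p.add_argument("--batch-size", type=int, default=512)
    p.add_argument("--learning-rate", type=float, default=6e-5)
    p.add_argument("--num-steps", type=int, default=100_000)
    p.add_argument("--checkpoint-every", type=int, default=5_000,
                   help="save a checkpoint every N steps")
    return p

# Parse defaults; a real run would pass sys.argv overrides per experiment.
args = build_parser().parse_args([])
```

In practice you would override these per experiment, e.g. `python train.py --batch-size 256 --learning-rate 1e-4`.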

Additional Services

  • Bigger dedicated clusters are available to reduce time to accuracy and to work on larger models. 
  • Additional cluster time for hyperparameter tuning, pre-production training runs, post-production continuous pre-training or fine-tuning is available by the hour. 
  • CPU hours from Cirrascale for dataset preparation 
  • CPU or GPU support from Cirrascale for production model inference 

Introductory Pricing

These prices represent blocks of dedicated cluster time for the chosen model. Additional system time is available at an hourly rate as needed.

Model        Parameters (B)   Tokens to Chinchilla point (B)   CS-2 days to train**   Price to train
GPT3-XL      1.3              26                               0.4                    $2,500
GPT-J        6                120                              8                      $45,000
GPT-3 6.7B   6.7              134                              11                     $40,000
T-5 11B      11               34*                              9                      $60,000
GPT-3 13B    13               260                              39                     $150,000
GPT NeoX     20               400                              47                     $525,000
GPT 70B      70               1400                             85                     $2,500,000
GPT 175B     175              3500                             call for quote         call for quote

* T5 tokens to train from the original T5 paper. Chinchilla scaling laws not applicable. 
** Expected number of days, based on training experience to date. Actual training may take more or less time.
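For most rows above, the Chinchilla-point token counts follow the compute-optimal heuristic of roughly 20 training tokens per model parameter, which you can sanity-check in a few lines of Python:

```python
# Chinchilla compute-optimal heuristic: ~20 training tokens per parameter.
def chinchilla_tokens_b(params_b: float) -> float:
    """Approximate compute-optimal training tokens, in billions,
    for a model with params_b billion parameters."""
    return 20.0 * params_b

# e.g. a 1.3B-parameter model needs ~26B tokens; a 70B model ~1,400B tokens
```

(T-5 11B is the exception: its token count comes from the original T5 paper, as noted above.)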

Interested? Try our free trial!

This is how to get started with the world’s fastest AI accelerators, quickly and easily, for free.

  • 2-day trial access to a cluster:
    • Secure, dedicated access to a programming environment for the trial period
    • Systems, code, data, documentation provided by Cerebras
    • Ability to run models on 1, 2, or 4 systems within the cluster on a Cerebras-curated version of the open source Pile dataset
  • PyTorch or TensorFlow models to choose from, including: GPT 1.3B, 6B, 6.7B, 13B, 20B. Please see the Cerebras Model Zoo for examples
  • Trial programming environment:
    • Cerebras-provided Python scripts allowing the trial user to vary the number of CS-2 systems used for training and the GPT model implementation/size
    • Cerebras-provided scripts allowing the trial user to vary learning rate, steps, checkpointing frequency
  • Access to trial trained weights

Learn more about the free trial

FAQ

How do you size the compute behind each offering?

This offering is all about simplified access to large-scale compute to train large-scale language models in a short time. We have provisioned specific CS-2 accelerator resources for each model above to deliver the throughput needed to reach the target number of tokens in the listed time. This way, when you use the selected Model Studio code and configuration, you can be sure you’ll complete training to the listed number of tokens. And you can always get more CS-2 resources if needed – contact us to learn more. The Model Studio has access to an elastic pool of CS-2 resources, up to and including large 16-node CS-2 wafer-scale clusters like Andromeda. 

What exactly does a fixed-price production training run include?

The production training run is intended to be a single run from scratch to the listed number of tokens (for most models above, the number of tokens listed is defined by the Chinchilla model scaling laws for large language models). We understand that model development involves a lot more, and your production training run may not go off without a hitch – see below for more information on those issues. When you select a model and purchase one of our fixed price production training run offerings, you’re essentially getting a pre-defined pool of CS-2 resources for the listed amount of time. By using our code and configuration, we ensure sufficient accelerator resources to deliver the token throughput needed to train to the listed number of tokens in the advertised time. Need more time or more accelerator resources? No problem, we have both.

Does the Model Studio support work beyond the production training run?

Yes! The Cerebras AI Model Studio is a fully provisioned model development facility. You can rent CPU hours for data preprocessing and input pipeline development; additional CS-2 accelerator resources for hyperparameter tuning, pre-production experimental training runs, and training evaluation; and compute resources for production inference after training. We can support fine-tuning, continuous pre-training, and model re-training as well.

What happens if my training run hits problems?

Not every run is perfect. If you run into issues, we’ll work together. If the issue is with Cerebras AI Model Studio code or systems, we’ll credit you back the time and help you start again. If the issue is a user or ML matter (e.g. suboptimal hyperparameters for the run), you’ll retain the remainder of your system allocation time and can procure more time as needed to finish the run to your desired end state. We’re here to help you be successful.

Can I start from a pre-trained checkpoint instead of training from scratch?

Yes! We will provide pre-trained checkpoints for the listed models, trained on public, open-source data, and you can purchase CS-2 accelerator resources to train from them. Contact us to learn more.

How much setup and distributed-systems work is required?

Cerebras AI Model Studio provides a ready-to-use training environment, with all required software components already installed and configured. Running training jobs on a multi-million core AI system in the Cerebras AI Model Studio is as easy as running training jobs on a single GPU – no need to manage distributed resources, no need to worry about loading and saving checkpoints across many nodes, no need to think about complex hybrid parallelism strategies, no need to distribute shards of the optimizer state parameters. Cerebras AI Model Studio uses model implementations from the Cerebras Model Zoo in standard PyTorch and TensorFlow. Feel free to take a look!

Contact us to learn more