Whether you’re breaking new ground and training a model from scratch, fine-tuning a generic pre-trained model to extract better domain-specific performance, or continuously training a model from an existing checkpoint, Cerebras now has you covered. With this launch, users can fine-tune models like GPT-J, GPT-NeoX, and CodeGen faster, cheaper, and easier than with competing solutions.

What is pre-training?

Large language models are revolutionizing the way we interact with technology. With cutting-edge artificial intelligence technology, these models can understand and respond to natural language input with unprecedented accuracy and fluency. The potential applications for this technology are limitless, from customer service and product recommendations to language translation and content creation.

Training a large language model is an investment in the future of your business. With advanced machine learning techniques, large language models can be trained from scratch or fine-tuned to meet the specific needs and goals of your organization.

Training a large language model from scratch involves building a model from the ground up using a massive amount of training data. This process can take a lot of time, computing resources, and expertise. On the other hand, fine-tuning a large language model involves using a pre-existing, pre-trained model as a starting point and adjusting it for your specific use case. This process is much faster, less computationally intensive, and often requires less expertise. With fine-tuning, you can leverage the knowledge and capabilities of a pre-trained model and make specific adjustments to optimize performance for your application, delivering results that are better suited to your needs.
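To make the distinction concrete, here is a deliberately tiny, self-contained sketch (plain Python on a one-parameter toy model, not Cerebras' actual stack): "pre-training" fits the model from scratch on a larger generic dataset, while "fine-tuning" warm-starts from the pre-trained weights and adapts them on a small domain-specific dataset with a lower learning rate and fewer steps.

```python
# Toy illustration of pre-training vs. fine-tuning. All datasets and
# hyperparameters here are made up for demonstration purposes.

def sgd_fit(w, data, lr=0.01, epochs=200):
    """Minimize mean squared error of y ~ w * x with plain SGD."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x
            w -= lr * grad
    return w

# "Pre-training" corpus: large, generic (true slope 2.0).
generic_data = [(x, 2.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]
# "Fine-tuning" corpus: small, domain-specific (slightly shifted slope 2.2).
domain_data = [(x, 2.2 * x) for x in (1.5, 2.5)]

# Pre-training: start from scratch (w = 0), many steps on the big dataset.
w_scratch = sgd_fit(0.0, generic_data)
# Fine-tuning: warm-start from the pre-trained weight, fewer steps,
# lower learning rate, small dataset.
w_finetuned = sgd_fit(w_scratch, domain_data, lr=0.005, epochs=50)

print(round(w_scratch, 2), round(w_finetuned, 2))  # ~2.0, ~2.2
```

The fine-tuning run converges quickly precisely because it starts close to a good solution, which is the same intuition behind fine-tuning a pre-trained language model: most of the expensive learning has already been paid for.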

Challenges with Fine-Tuning

Fine-tuning is an attractive option for those who want to begin from a pre-trained checkpoint rather than training from scratch. This is often a smaller job than pre-training, but fine-tuning large language models at scale on traditional systems still poses challenges – challenges Cerebras makes easy to overcome. While fine-tuning requires less training data, training time can still be long. Fine-tuning large language models on traditional cloud GPUs often requires complex model-parallel distribution, weeks or more of engineering time spent in setup, and long training runs, even on the largest publicly available instances. Not only is this time-consuming, but the costs for setup, model distribution, experimental sweeps, fine-tuning, and re-training in a traditional cloud add up quickly.

Finally, as users generate more data and a better understanding of their customers, they will want to fine-tune additional models and even train models from scratch. At this point, having a software platform that enables effortless changes to the training job, whether it be fine-tuning or training from scratch, is essential. In addition, having ownership of the trained weights becomes even more important as a user will need these weights to accelerate and improve down-the-line tasks such as fine-tuning.

Cerebras announces Fine-Tuning via the AI Model Studio

Cerebras addresses these issues, and more, with the launch of fine-tuning on the Cerebras AI Model Studio. Users now have access to a variety of large language models, including GPT-J (6B), GPT-NeoX (20B), and CodeGen (350M to 16B), with more models and checkpoints coming soon. These models are available for fine-tuning at competitive prices. Fine-tuning GPT-J 6B costs $0.00096 per 1,000 tokens with the Cerebras AI Model Studio, a stark discount compared to fine-tuning OpenAI’s Curie (6.7B) at $0.003 per 1,000 tokens. Not only is fine-tuning on the Cerebras AI Model Studio cheaper, it’s 8x faster. Recall that training GPT-NeoX with 10B tokens on a traditional cloud would take 19 days; with the Cerebras AI Model Studio, it takes only 2.3 days. Cerebras achieves these performance gains through the Wafer-Scale Cluster and passes the savings on to the user. And with the Cerebras AI Model Studio, your data is always secure and remains yours; you own your ML methods, and your trained weights are yours to keep.

| Model | Parameters (B) | Fine-tuning price per 1K tokens | Price per example (MSL 2048) | Price per example (MSL 4096) | Cerebras time to 10B tokens (h) | AWS p4d (8x A100) time to 10B tokens (h) |
|---|---|---|---|---|---|---|
| Eleuther GPT-J | 6 | $0.00055 | $0.0011 | $0.0023 | 17 | 132 |
| Eleuther GPT-NeoX | 20 | $0.00190 | $0.0039 | $0.0078 | 56 | 451 |
| CodeGen* 350M | 0.35 | $0.00003 | $0.00006 | $0.000213 | 1 | 8 |
| CodeGen* 2.7B | 2.7 | $0.00026 | $0.0005 | $0.0027 | 8 | 61 |
| CodeGen* 6.1B | 6.1 | $0.00065 | $0.0013 | $0.0030 | 19 | 154 |
| CodeGen* 16.1B | 16.1 | $0.00147 | $0.0030 | $0.011 | 44 | 350 |
* T5 tokens to train from the original T5 paper. Chinchilla scaling laws not applicable.
** Note that GPT-J was pre-trained on ~400B tokens. Fine-tuning jobs can use a wide range of dataset sizes, but often on the order of 1-10% of the pre-training tokens, so one might fine-tune a model like GPT-J with ~4-40B tokens. The table above gives estimated wall-clock times to fine-tune each model checkpoint with 10B tokens on the Cerebras AI Model Studio and on an AWS p4d instance, to give you a sense of how long jobs of this scale could take.
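As a quick sanity check on the table (our own back-of-the-envelope arithmetic, not a published Cerebras pricing formula), the per-example prices for the GPT-J and GPT-NeoX rows follow directly from the per-1K-token price multiplied by the maximum sequence length (MSL), and the time-to-10B-tokens columns imply the roughly 8x speedup quoted above:

```python
# Back-of-the-envelope check of the pricing table (our arithmetic, hedged:
# CodeGen rows appear to use different per-example pricing and are omitted).

def price_per_example(price_per_1k_tokens, msl):
    """Cost of one training example that is `msl` tokens long."""
    return price_per_1k_tokens * msl / 1000

# GPT-J (6B): $0.00055 per 1K tokens
pj_2048 = round(price_per_example(0.00055, 2048), 4)  # matches table: $0.0011
pj_4096 = round(price_per_example(0.00055, 4096), 4)  # matches table: $0.0023

# GPT-NeoX (20B): $0.00190 per 1K tokens
px_2048 = round(price_per_example(0.00190, 2048), 4)  # matches table: $0.0039
px_4096 = round(price_per_example(0.00190, 4096), 4)  # matches table: $0.0078

# Speedup implied by the time columns for GPT-NeoX (AWS hours / Cerebras hours).
speedup = round(451 / 56, 1)  # ~8.1x, consistent with the "8x faster" claim

print(pj_2048, pj_4096, px_2048, px_4096, speedup)
```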

The Cerebras AI Model Studio benefits users who want access to large language models and the ability to fine-tune them faster and cheaper than the competition. It is also advantageous for users who seek to mature their generative AI applications: for those who want to transition from fine-tuning to training from scratch, the Cerebras AI Model Studio offers a simple, push-button training experience.

Finally, the Cerebras AI Model Studio is accessible to all users, regardless of their experience training large language models. Experienced users can get access to a virtual environment for running experiments in a self-service manner. For users gaining their first experience with generative AI, Cerebras offers a white-glove service in which Cerebras’ world-class staff fine-tune a model on their behalf. For both user types, fine-tuning with the AI Model Studio is ideal for building out an MVP generative AI experience, testing a product hypothesis faster, or iterating across different model architectures and sizes to see what works best. The Cerebras AI Model Studio delivers faster training, lower cost, and an easy-to-use service thanks to years of innovation building systems optimized for AI and NLP applications. To learn more about all of Cerebras’ innovations in NLP, visit our site here.

How to get started

Fine-tuning with the Cerebras AI Model Studio is easy. Contact Cerebras by emailing us at developer@cerebras.net or by filling out this form. Please also use this link if you are interested in a model that is not listed.

Udai Mody / February 16, 2023