Introducing gigaGPT: GPT-3 sized models in 565 lines of code

GigaGPT is Cerebras’ implementation of Andrej Karpathy’s nanoGPT – the simplest and most compact code base to train and…

Tensor Shape: Increasing Model Throughput

We write machine learning algorithms to fit the data, rather than padding the data to suit hardware limitations.

