With a highly successful SC22 conference in the rear-view mirror, I thought it was worth reflecting (sorry) on some of the highlights. In brief, we announced Andromeda, our groundbreaking AI supercomputer, we talked ourselves hoarse, and we won awards! For less brevity, read on.

Big News

Backing up a bit, we recently unveiled the Cerebras Wafer-Scale Cluster, which brings our novel weight-streaming technology to life. We can seamlessly combine many CS-2 systems – each of which already packs a cluster’s worth of AI and HPC compute onto one enormous chip – into clusters of arbitrary size. This technology has two features rare in this industry: no distributed programming and near-linear performance scaling.

As the exhibition opened, we introduced the world to Andromeda, the biggest Cerebras Cluster to date and one of the largest AI supercomputers ever built. Its 16 CS-2 nodes combine for a mind-boggling 13.5 million AI compute cores delivering more than an exaFLOP of sparse AI compute! And the performance results prove beyond doubt that weight-streaming works, and works at massive scale.

Customers and academic researchers are already running real workloads and deriving value from Andromeda’s extraordinary capabilities. If you have an “impossible” problem, we can help.

Not surprisingly, Andromeda caused something of a stir, generating lots of top-tier press coverage, including articles from Ars Technica, The Register, Reuters and HPCWire. Andromeda also featured heavily in our booth, which brings me to my next highlight…

The Exhibition

It felt good to be back, in-person, with everybody who’s anybody in the AI and HPC communities. Our booth was a hive of activity, from before the show even opened to after the carpets were rolled up at the end.

An all-star cast of AI and HPC experts from Cerebras was on hand to answer questions on every subject, from the microarchitecture of our Wafer-Scale Engine (still unique, still the chip that people want selfies with) to the inner workings of GPT-style neural networks and HPC algorithms.

I think this video clip I shot while circling our booth gives a nice sense of the Cerebras experience at SC22, complete with the second most interesting hardware exhibit at the show. (Call me sentimental, or even disloyal, but the Cray-1 in the Hewlett Packard Enterprise booth wins that prize. We’ve come a long way in nearly 50 years, but to me, that’s the iconic machine that will always define what a supercomputer looks like.)

Making Compute-Bound HPC Possible

On the subject of supercomputing, we had some impressive results to share. Member of technical staff Mathias Jacquelin presented a paper, co-authored with French energy leader TotalEnergies, that showed a >200X speedup over an NVIDIA® A100 GPU on a 25-point stencil code workload – a key building block for seismic modeling.
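For readers curious what a “25-point stencil” looks like, here is a minimal illustrative sketch (not the Cerebras/TotalEnergies kernel): an 8th-order finite difference touches 8 neighbors along each of the 3 axes plus the center point, 3 × 8 + 1 = 25 points in all, and this star-shaped update is the core of seismic wave-propagation codes.

```python
import numpy as np

# Standard 8th-order central-difference coefficients for the 2nd derivative.
C = np.array([-205.0 / 72, 8.0 / 5, -1.0 / 5, 8.0 / 315, -1.0 / 560])

def laplacian_25pt(u, h=1.0):
    """8th-order 3D Laplacian via the 25-point star stencil.

    Uses np.roll, so boundaries wrap around (periodic) -- fine for a
    sketch, not for a production seismic code.
    """
    out = 3 * C[0] * u  # center coefficient, counted once per axis
    for axis in range(3):
        for r in range(1, 5):
            out += C[r] * (np.roll(u, r, axis=axis) + np.roll(u, -r, axis=axis))
    return out / h**2
```

Each output point reads 25 input points but does only a few dozen flops, which is exactly why kernels like this starve conventional memory hierarchies (more on that below in the roofline sense).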

And we were privileged to have Dr. Dirk Van Essendelft from the National Energy Technology Laboratory give a presentation in our booth theater about a simple Python API his team wrote for building applications such as computational fluid dynamics (CFD) codes, which now run on a CS-2 at Pittsburgh Supercomputing Center. I always find myself shaking my head in disbelief at his results: 470x faster than his entire Joule supercomputer running OpenFOAM®, at about 1/1000th the energy! No wonder he considers Cerebras an example of the kind of computing technology the USA needs to ensure a sustainable energy future.

Cerebras’ Distinguished Engineer Rob Schreiber wrote a very good blog on this work, which you can read here.

HPC systems have struggled for years with falling memory and communications bandwidth relative to processing power, leading to the dreaded “memory-bound” or “communications-bound” labels for codes that just can’t keep their processors busy. By keeping entire workloads on-chip and avoiding all the complexity and compromise of a traditional memory hierarchy, Cerebras’ network and memory run at silicon speed. As a result, we keep hearing the magic words: “compute-bound” instead.
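The memory-bound vs compute-bound distinction can be made concrete with a back-of-envelope roofline check. The sketch below is illustrative only; the stencil flop count and the A100 figures are rough public ballpark numbers, not vendor-exact specs.

```python
def bound_kind(ai, peak_flops, mem_bw):
    """Classify a kernel under the roofline model.

    ai:         arithmetic intensity, in flops per byte moved
    peak_flops: peak compute throughput, flop/s
    mem_bw:     bandwidth feeding the cores, byte/s
    """
    ridge = peak_flops / mem_bw  # intensity at the roofline ridge point
    return "compute-bound" if ai >= ridge else "memory-bound"

# A 25-point FP32 stencil does on the order of 50 flops per output point
# while, even with ideal caching, moving ~8 bytes (one read, one write)
# to and from DRAM:
ai_stencil = 50 / 8  # ~6.25 flops/byte

# Rough public A100 figures: ~19.5e12 FP32 flop/s, ~2e12 byte/s of HBM.
# Ridge = ~9.75 flops/byte, above the stencil's intensity:
print(bound_kind(ai_stencil, 19.5e12, 2.0e12))  # memory-bound

# If memory bandwidth rises by an order of magnitude or more -- the
# situation the paragraph above describes, with memory at silicon speed
# next to the cores -- the ridge drops below the stencil's intensity:
print(bound_kind(ai_stencil, 19.5e12, 2.0e13))  # compute-bound
```

The point of the sketch: the kernel’s arithmetic intensity is fixed by the algorithm, so the only way to make it compute-bound is to raise the bandwidth side of the ratio.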

Speaking of Prizes…

HPCWire presented us with their Editor’s Choice award for Best AI Product or Technology for the CS-2 system. Here’s a photo of our co-founder and CEO, Andrew Feldman, receiving the award from HPCWire’s president and CEO, Tom Tabor.

But then, the big one. This is the second consecutive year that teams from ANL, Cerebras and other leading institutions have been nominated for the ACM Gordon Bell Special Prize for HPC-Based COVID-19 Research. Just being nominated is an honor, so we weren’t too disappointed when we didn’t win last year for a project that developed a host of new computational techniques to create a simulation of the virus’ replication mechanism that runs across 4 supercomputing sites. (You can learn more in this blog by our ML engineering manager Vishal Subbiah.)

This year, we were nominated, along with colleagues at Argonne National Laboratory and a host of other leading institutions, for our project "GenSLMs: Genome-Scale Language Models Reveal SARS-CoV-2 Evolutionary Dynamics". Congratulations to our Solutions Engineers, Claire Zhang and Cindy Catherine Orozco Bohorquez (pictured below), who co-authored the paper. Claire wrote a great piece about the project, "Genomics in Unparalleled Resolution", which you can read here.

And this year, we won!

I was lucky enough to be at the ceremony, camera in hand, when the winners were announced. It was a great moment. I’m not ashamed to say that I nearly ran back across the exhibit hall to our booth to share the news.

And the icing on the cake? Andromeda was used in the study! For the first time, it made it easy to train massive neural networks on the complete genomes of millions of virus variants. In fact, using just 4 nodes of Andromeda, we trained 1.3 billion parameter GPT-J models on the SARS-CoV-2 genomes to convergence in less than one day. We’ve subsequently shown linear performance scaling up to a whopping 25 billion parameters. Models of this size can’t, to date, even run on a GPU-based supercomputer, let alone train to completion, because of the long sequence lengths necessary to achieve meaningful scientific insight. Cerebras allowed the team to do something previously intractable, and do it with ease.

It feels like the show came full circle for us, in a deeply satisfying way.

Now, if you’ll excuse me, I need to start working on SC23. See you in Denver!

Rebecca Lewington, Technology Evangelist | December 1, 2022