Cerebras has developed a radically new chip and system to dramatically accelerate deep learning applications. Our system runs training and inference workloads orders of magnitude faster than contemporary machines, fundamentally changing the way ML researchers work and pursue AI innovation.
We are innovating at every level of the stack – from chip to microcode, from power delivery and cooling to new algorithms and network architectures at the cutting edge of ML research. Our fully integrated system delivers unprecedented performance because it is built from the ground up for deep learning workloads.
Cerebras is building a team of exceptional people to work together on big problems. Join us!
You will develop high-performance software for communicating with the CS-1 at 1 Tb/s and beyond. Feeding configuration and training data from client systems to the CS-1 is a major challenge, requiring optimized data structures and algorithms that take full advantage of the available hardware resources: CPU, memory, storage, and network bandwidth.
The software must be built with a high degree of concurrency across threads, processes, cores, and systems. This role owns all the software between the ML frameworks’ hardware-accelerator interfaces and the Linux I/O system calls.
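To give a flavor of the concurrency involved, here is a minimal, hypothetical sketch of a host-side data path: reader threads pull fixed-size chunks from a source in parallel, and the results are reassembled in order before being pushed toward the device. All names (`CHUNK_SIZE`, `stream_chunks`, the per-chunk "work") are illustrative assumptions, not Cerebras APIs.

```python
# Hypothetical fan-out/fan-in pipeline for streaming data to an accelerator.
# Worker threads process chunks concurrently; output order is preserved
# by indexing chunks before dispatch. Illustrative only.
import queue
import threading

CHUNK_SIZE = 64 * 1024  # illustrative chunk size, not a CS-1 parameter
SENTINEL = None         # end-of-stream marker for workers


def stream_chunks(data: bytes, num_workers: int = 4) -> bytes:
    """Split data into chunks, process them on worker threads, reassemble."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    in_q: queue.Queue = queue.Queue()
    out: dict = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = in_q.get()
            if item is SENTINEL:
                in_q.task_done()
                return
            idx, chunk = item
            # Stand-in for real per-chunk work (framing, checksumming,
            # pinning buffers for zero-copy I/O, etc.).
            with lock:
                out[idx] = chunk
            in_q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for idx, chunk in enumerate(chunks):
        in_q.put((idx, chunk))
    for _ in threads:
        in_q.put(SENTINEL)  # one shutdown marker per worker
    for t in threads:
        t.join()
    # Reassemble in original order regardless of completion order.
    return b"".join(out[i] for i in range(len(chunks)))
```

In a real system the per-chunk work would be the expensive part (serialization, zero-copy handoff to the NIC), and the design question is keeping every core, memory channel, and network link saturated at once.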
Skills & Qualifications:
- Senior architect with 10+ years of experience
- Strong programming skills in C++ and Python, including multi-threaded and multi-process design
- Track record of owning large, complex system software; HPC, cloud, or data center experience with performance optimization
- Prior projects that started from a large amount of hardware and required software to unlock its full potential
Team: Host IO
Location:
- Headquarters/Los Altos Office
- Remote Office
- San Diego Office
- Toronto Office