Cerebras is developing a radically new chip and system to dramatically accelerate deep learning applications. Our system runs training and inference workloads orders of magnitude faster than contemporary machines, fundamentally changing the way ML researchers work and pursue AI innovation.
We are innovating at every level of the stack – from chip, to microcode, to power delivery and cooling, to new algorithms and network architectures at the cutting edge of ML research. Our fully-integrated system delivers unprecedented performance because it is built from the ground up for the deep learning workload.
Cerebras is building a team of exceptional people to work together on big problems. Join us!
You will work with the software design teams to analyze and optimize workload performance.
- Develop tools to analyze performance and identify bottlenecks and optimization opportunities.
- Develop performance infrastructure to validate, ablate, and regression-test high-performance kernels.
- Bring up and tune the performance of new kernels.
- Develop automation to maintain end-to-end performance of workloads running on the Cerebras CS-1 accelerator.
Skills & Qualifications:
- Senior architect, 10+ years of experience
- Strong programming skills: C++, Python, multithreading, multiprocessing
- Experience with end-to-end workload analysis, from low-level assembly code to high-level distributed algorithms.
- Experience with performance analysis on CPUs, GPUs, TPUs, parallel architectures / distributed systems, dataflow / spatial architectures, and many-core, multithreaded environments
- Strong software engineering background and experience building test infrastructure
- PhD or Master’s degree in Computer Science, Electrical Engineering, or equivalent
- Focus in computer architecture is desirable
- Programming/scripting experience in C/C++ and Python
Locations:
- Headquarters/Los Altos Office
- Remote Office
- San Diego Office
- Toronto Office