Job Description
We are looking for a Senior ML Systems Engineer to build and validate simulation infrastructure for large-scale machine learning systems. This role focuses on modelling the compute and communication behaviour of systems used for ML training and inference, and using simulation to guide architecture, performance optimization, and capacity planning.
The ideal candidate combines strong systems experience with hands-on experience in measurement, benchmarking, and performance analysis of modern ML systems.
Experience:
The ideal candidate will have strong experience in ML systems, distributed systems, performance engineering, computer architecture, or simulation, together with hands-on experience in performance benchmarking, profiling, and measurement of ML systems.
You should have an understanding of systems used for machine learning training and inference, coupled with experience analysing compute, communication, and memory behaviour in large-scale ML systems.
Expe...