Job Description
Senior MLOps Engineer
We are seeking a Senior MLOps Engineer to steer the technical vision of our Training and Inference Optimization team. In this high-impact role, you will architect the infrastructure that powers our next-generation AI models. You will bridge the gap between systems programming and machine learning, optimizing large-scale LLM training with NVIDIA NeMo and building ultra-high-throughput serving systems using vLLM, TensorRT-LLM, and SGLang.
Your mission is to ensure our models are not only state-of-the-art but also production-hardened, cost-efficient, and performant at scale.
Key Responsibilities
• Training Infrastructure: Architect and maintain scalable distributed training pipelines using NVIDIA NeMo/Nemotron/Megatron-Bridge. You will optimize GPU utilization, manage complex checkpointing strategies, and implem...