Full-time | Posted June 10, 2026

Job Description

Join NVIDIA as a Senior Software Engineer specializing in AI inference optimization. The role centers on GPU kernel development and benchmarking.

We are looking for experienced software engineers dedicated to improving AI inference systems. You will help architect and optimize the vLLM inference framework, with a focus on high-performance computing across GPU clusters, and you will collaborate with multiple teams to advance accelerated computing.

Key Responsibilities:
• Enhance vLLM's features to optimize new models
• Benchmark and optimize GPU kernels using advanced methods
• Create methodologies for industry-leading benchmarking tools
• Design orchestration for large-scale inference deployments
• Conduct original research for ML Systems advancements

Requirements:
• PhD with top publications in ML Systems or relevant field
• Expertise in programming with Python and C/C++
• Know...
