Job Description
## Senior Software Engineer, AI Inference

* Locations: Canada, Toronto
* Time type: Full time
* Posted on: Posted 2 Days Ago
* Job requisition id: JR

Help us push the boundaries of AI inference at NVIDIA, where your systems expertise shapes both the technology and the teams building on top of it!

**What You'll be doing:**

* Work directly with customer engineering teams through long-term technical partnerships, understanding their LLM serving architectures and performance goals, then designing and implementing end-to-end benchmarking campaigns across Kubernetes and Slurm environments to surface actionable insights.
* Set up and operate vLLM serving deployments on GPU clusters, tuning configurations for throughput, latency, and efficiency, and collect Nsight Systems / Nsight Compute profiling traces to identify performance gaps relative to reference frameworks.
* Develop detailed performance plans based on profili...