Job Description
NVIDIA has been at the forefront of the deep learning revolution, pioneering innovations that have transformed the entire field. As the leading provider of GPUs and AI computing platforms, NVIDIA has empowered researchers and engineers worldwide to accelerate breakthroughs in artificial intelligence.
We seek a versatile Senior Software Engineer who is passionate about performance optimization and generative AI. Our team brings the latest research in LLM inference — from novel decoding strategies to quantization schemes — into production across NVIDIA's hardware lineup, from large data center servers to powerful edge devices. We work on the most advanced architectures in the field, with a focus on NVIDIA's own.
What you'll be doing:
+ Implement and optimize inference algorithms for LLM and omnimodal architectures, including hybrid Mamba-Transformer and mixture-of-experts models
+ Profile inference pipelines using NVIDIA's profiling and simulation tool...