Job Description

A leading technology firm in San Francisco seeks a GPU Optimization Engineer to maximize GPU performance in real-time AI systems. The ideal candidate will have strong experience with CUDA/Triton, a deep understanding of GPU execution, and a knack for optimizing inference latency for large generative models. The role offers a competitive base salary of up to ~$300,000 plus meaningful equity, and it is a growth hire rather than a backfill for a previous role. Relocation and visa support are available.


Real-Time GPU Inference Optimization Engineer (Hiring Immediately)
