Full-time | Posted May 24, 2026

Job Description

Requirements:

  • 4 years of experience as a fullstack or backend engineer
  • Strong proficiency in Python and JavaScript/TypeScript
  • Experience with FastAPI / Django / Node.js and React / Next.js
  • Solid understanding of distributed systems and async architectures
  • Hands-on experience deploying LLMs such as GPT-4/4.1, Claude, LLaMA, Mistral, Mixtral
  • Experience serving models using vLLM, Triton, TGI, or similar frameworks
  • Strong understanding of transformer models and inference trade-offs
  • Experience with embeddings, vector search, and RAG architectures
  • Experience with AWS, GCP, or Azure (GPU workloads preferred)
  • Strong Docker and Kubernetes experience
  • Familiarity with CI/CD pipelines for ML systems
  • Experience with observability tools (Prometheus, Grafana, OpenTelemetry)
  • Experience with multimodal AI (audio, video, image models)
  • Experience optimizing LLM inference...
