Job Description
We're looking for a Senior/Staff AI Engineer, Inference & Agent Systems, for a rapidly growing fintech startup that is setting up its operations in India.
Why Join?
Get the opportunity to be a founding member of the team and build the product from scratch.
About the Role and the Work:
Inference Optimization
Drive time to first token (TTFT) below 400 ms for multi-step agent pipelines
Streaming optimization: first token to user while sub-agents are still running
KV cache strategy, prompt compression, dynamic context window management
Multi-provider routing: model selection by latency, cost, and task type across OpenAI, Anthropic, Gemini, and open-weight models
Infrastructure
Model serving and cold start optimization
Async worker architecture for parallel sub-agent execution
Observability: trace every token, every tool call, every synthesis step
What We're Looking For:
You've built something that runs in production at a meaningful scale and you understan...