Job Description
Role Overview
Support the deployment, scaling, optimization, and monitoring of AI/ML models in production environments. Work closely with data scientists and developers to ensure models run efficiently and reliably, with low inference latency.
Key Responsibilities
- Develop and maintain ML/AI models, and deploy them to production environments.
- Build and serve model inference APIs using frameworks like FastAPI.
- Optimize models for faster inference using techniques such as quantization and model compression.
- Package and containerize models using Docker and manage deployments with orchestration tools (e.g., Kubernetes).
- Set up CI/CD pipelines and automation workflows for model deployment.
- Monitor model performance, latency, and reliability in production.
- Troubleshoot and resolve deployment, infrastructure, or inference issues.
- Collaborate with ML Engineers, Data Scientists, and DevOps ...