Full-time · Posted June 19, 2026

Job Description

Responsibilities

  • Design and implement AI/ML-powered solutions for infrastructure use cases, including predictive autoscaling, anomaly detection, intelligent cost optimization, and automated remediation across GCP and multi‑cloud environments
  • Build and maintain AI‑driven monitoring and observability systems that correlate logs, metrics, and traces to surface root causes, predict bottlenecks, and reduce mean time to resolution (MTTR)
  • Develop and operate automated incident response workflows using AI‑powered playbooks that diagnose, contain, and resolve infrastructure issues with minimal manual intervention
  • Integrate AI tooling into CI/CD pipelines to improve deployment reliability, automate test prediction, score release health, and support rollback automation
  • Contribute to the development of internal AI agents and virtual assistants integrated into developer workflows (Slack, IDEs, Confluence) — enabling self‑service for provisioni...
