Full-time | Posted June 25, 2026

Job Description

Key Responsibilities

Cross-Cluster Standardization: define and enforce incident management practices; standardize alerting, monitoring, and request handling; align workflows across ServiceNow and Jira; ensure consistency across all clusters.

Reliability Engineering: define SLO, SLA, MTTR, and MTRS standards; identify systemic reliability gaps; drive incident reduction and prevention strategies; establish reliability as a measurable discipline.

Automation Strategy: identify cross-cluster automation opportunities, define reusable automation patterns and frameworks, eliminate duplicated operational solutions, drive reduction of manual toil.

Architecture Alignment: partner with Solution Architects across clusters; ensure operability is built into system design; align monitoring, alerting, and failover strategies; prevent conflicting tooling or architectural decisions.

Governance and Reviews: lead cross-cluster SRE reviews, track adoption of standards, dr...
