Full-time Posted June 03, 2026

Job Description

What You Will Do

Operational Support & Incident Management

  • Provide 2nd level support for production systems and critical business applications.

  • Investigate, troubleshoot, and resolve incidents and performance issues.

  • Perform root cause analysis (RCA) and document findings in a structured manner.

Monitoring, Observability & Automation

  • Design, implement, and maintain monitoring dashboards.

  • Improve alert quality and reduce noise through effective threshold and metric design.

  • Analyze logs, metrics, and system behavior to proactively detect anomalies.

  • Automate operational processes using Ansible and scripting.

What You Bring

Operational Mindset & Collaboration

  • Proven experience in Site Reliability Engineering, DevOps, or 2nd level production support.

  • Effective communication skills and ability to work with cross-functional teams.

Technical Skills
