Job Description
Site Reliability Engineer (SRE)
Responsibilities
Design, implement, and maintain scalable and highly available infrastructure.
Monitor and ensure the performance and reliability of production systems.
Implement automation for recurring tasks and operational processes.
Collaborate with development teams to improve continuous delivery and code deployment.
Respond to incidents and conduct post-mortem analysis to prevent future issues.
Optimize resource usage and manage system capacity.
Requirements
Experience in a similar role.
Knowledge of Unix/Linux operating systems.
Experience with monitoring and log management tools (Prometheus, Grafana, Splunk, ELK stack).
Scripting and automation skills (Python, Bash, Go, Shell).
Experience with cloud platforms (AWS, GCP, Azure).
Knowledge of containers and orchestration (Docker, Kubernetes).
Familiarity with CI/CD tools (Jenkins, GitLab CI/CD, CircleCI).
Experience in configuration ma...