Overview
We are seeking a knowledgeable Site Reliability Engineer (SRE) with valid MOD DV clearance to join a dynamic SME in the Defence sector. The contractor will work with cross-functional teams to maintain system reliability and performance, drawing on extensive experience in infrastructure management and automation tools.
Responsibilities
- Develop and implement infrastructure monitoring solutions using Prometheus or Grafana.
- Manage application deployment and orchestration using Docker, Kubernetes, and OpenShift.
- Utilize configuration management tools such as Ansible or Chef to automate setup processes.
- Design and maintain CI/CD pipelines in Jenkins to streamline development workflows.
- Oversee cloud infrastructure on AWS, including EC2, RDS, and S3.
- Provision and manage infrastructure as code using Terraform.
Requirements
- Proven experience with configuration management tools like Ansible or Chef.
- Hands-on expertise with Docker containers, Kubernetes, and OpenShift.
- Familiarity with Terraform for infrastructure automation.
- Experience with continuous integration and delivery processes, particularly using Jenkins.
- Knowledge of monitoring tools such as Prometheus or Grafana.
- Strong understanding of SQL and Linux administration.
- Proficiency in shell scripting or a similar scripting language.
- Active MOD DV clearance, or eligibility to obtain it, is required.