DV Site Reliability Engineer

Overview

We are seeking an experienced Site Reliability Engineer (SRE) to ensure the availability and performance of our cross-domain services used within high-profile government organizations. In this role, you will collaborate with development and support teams to enhance our infrastructure and delivery pipelines, improve system observability, and mitigate reliability risks. The ideal candidate will have a strong understanding of cloud hosting services and infrastructure management tools.

Responsibilities

Design and maintain reliable, scalable physical and virtual infrastructure.
Monitor system performance and proactively resolve issues.
Automate processes using tools such as Ansible to improve efficiency and consistency.
Collaborate with engineers and stakeholders across the business.
Support continuous improvement of systems, tools, and practices.
Operate across the full infrastructure stack, from bare-metal systems to virtual deployments.

Requirements

Experience using modern configuration management tools (Ansible, Chef, or similar).
Experience working with Terraform.
Experience with Docker containers and orchestration tools (Kubernetes, OpenShift, or Docker Swarm).
Experience with CI/CD tools (Jenkins or similar).
Familiarity with monitoring tools (InfluxDB, Prometheus, or Grafana).
Good understanding of relational databases and SQL.
Linux command line, administration, and shell scripting skills.
Experience with cloud hosting services (ideally AWS EC2, RDS, S3, Lambda).

Skills	SRE, Terraform, Kubernetes, OpenShift, SQL, AWS, Java, Python, Azure
Location	Cheltenham
Type	Hybrid
Rate	£500-£650/day
Source	LinkedIn
Recruiter	TwinStream
Posted	04/07/26