Site Reliability Engineer

Overview

We are seeking an experienced Site Reliability Engineer (SRE) with a focus on Observability to join a leading Wealth/Asset Management firm on a contract basis. The successful candidate will lead the implementation of a new observability solution, working collaboratively across digital platforms and engineering teams to enhance system reliability and performance. This remote role involves defining observability standards and promoting data-driven decision-making in line with business priorities and digital objectives.

Responsibilities

Define and drive the observability roadmap in alignment with business and platform objectives.
Establish and monitor SLIs, SLOs, and error budgets to enhance system reliability.
Design and implement observability runbooks covering various monitoring metrics.
Assist engineering teams in implementing observability tools and monitoring systems.
Promote best practices and governance of observability across teams.
Collaborate with SRE, DevOps, and engineering teams for seamless integration of observability practices.

Requirements

Minimum 10 years of engineering experience, with at least 5 years in SRE or Observability roles.
Demonstrated experience in implementing observability solutions in cloud environments (AWS, Azure, GCP).
Proficiency with observability tools such as Datadog, Grafana, Prometheus, and OpenTelemetry.
Strong understanding of distributed systems, microservices, and container orchestration.
Experience with automation tools such as Terraform and Ansible, along with CI/CD pipelines.
Familiarity with performance engineering and telemetry-based insights.
Proficiency in a programming or scripting language such as Python or Go.
Knowledge of secure infrastructure practices, RBAC, and compliance requirements.

Skills	Python, AWS, GCP, Azure
Location	City of Bristol
Type	Remote
Rate	£650-£750/day
Source	LinkedIn
Posted	06/11/25