
Senior Site Reliability Engineer

Our client is a profitable developer-tooling company whose product is used by engineering teams at thousands of software companies for application monitoring and incident management. Their infrastructure runs across three AWS regions with strict 99.95% availability commitments, and the SRE team is a senior, well-resourced group of nine.

As Senior SRE you will lead reliability initiatives across the platform: defining and driving SLOs and error budgets, running incident command for major outages, and building self-service infrastructure that lets product teams ship safely. You will work in Go and Python, use Terraform for IaC, and deploy on a Kubernetes-based platform.

The role offers genuine technical scope, a thoughtful on-call rotation (compensated, capped weeks, frequent retros), and the chance to shape reliability culture at a company whose customers are themselves SRE and DevOps practitioners.

Requirements

  • 6+ years of SRE, infrastructure engineering, or production operations experience
  • Strong proficiency in at least one of Go, Python, or Rust
  • Deep experience with Kubernetes, Terraform, and AWS (or GCP) at production scale
  • Demonstrated ownership of SLOs, error budgets, and incident response programs
  • Comfort writing public-facing postmortems and presenting reliability data to executives

Job details

Salary: $175,000 – $220,000 + Equity
Location: Denver, CO
Contract type: Permanent
Sector: Technology
To apply for this role, email hello@kovoro.com