Senior Site Reliability Engineer

Our client is a profitable developer-tooling company whose product is used by engineering teams at thousands of software companies for application monitoring and incident management. Their infrastructure runs across three AWS regions under strict 99.95% availability commitments, and the SRE team is a senior, well-resourced group of nine.

As Senior SRE you will lead reliability initiatives across the platform: defining and driving SLOs and error budgets, running incident command for major outages, and building self-service infrastructure that lets product teams ship safely. You will work in Go and Python, with Terraform for IaC and a Kubernetes-based deployment platform.

The role offers genuine technical scope, a thoughtful on-call rotation (compensated, capped weeks, frequent retros), and the chance to shape reliability culture at a company whose customers are themselves SRE and DevOps practitioners.

Requirements

6+ years of SRE, infrastructure engineering, or production operations experience
Strong proficiency in at least one of Go, Python, or Rust
Deep experience with Kubernetes, Terraform, and AWS (or GCP) at production scale
Demonstrated ownership of SLOs, error budgets, and incident response programs
Comfort writing public-facing postmortems and presenting reliability data to executives

Job details

Salary: $175,000 – $220,000 + Equity
Location: Denver, CO
Contract type: Permanent
Sector: Technology

Apply for this role

Or email hello@kovoro.com