Senior DevOps/Site Reliability Engineer
Limit Break
$160,000 – $218,000 USD
Remote
senior
about 1 month ago
crypto, tech, web3, Kubernetes, AWS, Terraform, Ansible, CI/CD, GitHub Actions, Jenkins, GitLab CI, Aurora
Description
What you'll do
- Identify performance and scalability bottlenecks across our multi-cluster EKS environment on AWS, then propose and execute improvements.
- Measure system health, scalability, and performance metrics, and identify areas for improvement.
- Deploy services and troubleshoot production issues day-to-day, using code to solve broad operational challenges within the Limit Break Infrastructure and Platform.
- Work with the wider engineering team to provide the most production-like environment possible for both manual and automated testing.
- Define SLOs, SLIs, monitoring, alerting, and incident response practices, and continuously improve our observability stack (Grafana, Thanos, Loki) to be ready for worldwide scale.
Requirements
- 5+ years of experience in SRE, DevOps, or systems engineering.
- Strong background in Kubernetes, including operating multiple EKS clusters in production.
- Extensive experience with Terraform and Ansible.
- CI/CD and automation experience with tools such as GitHub Actions, Jenkins, or GitLab CI.
- Solid background in AWS, including experience with Aurora, RDS (MySQL/SQL), and networking.
- Ability to participate in an on-call rotation.
- Effective communication skills to clearly explain your reasoning and thought process.
- Excellent collaboration skills to work closely with product engineers and product owners.
- Implementation of in-house monitoring and observability infrastructure (e.g. Grafana, Thanos, Loki, or equivalents).
- Implementation of an ElasticSearch stack or equivalent solution for capturing logs from all environments.
- Experience with CloudFlare, CDN technologies, and edge/perimeter networking.
- Exposure to cloud security and perimeter tooling such as Wiz (or equivalent CSPM/vulnerability detection), AWS GuardDuty, CloudFlare Zero Trust, and secrets management platforms.
- Experience addressing vulnerabilities: comfortable finding issues, digging deep to root cause, and driving remediation.
- Experience implementing tools to monitor and protect the environment in real time.