
Senior IaaS / Kubernetes Platform Engineer
CloudLinux
$115,000 – $195,500 USD
Remote
senior
about 1 month ago
Tags: dev, tech, Kubernetes, IaaS, Ceph, Terraform, Ansible, GitOps, Networking
Description
What You Will Do
- Kubernetes Platform Engineering (Primary Focus, 40%)
  - Design, build, and operate a multi-tenant Kubernetes platform using Cluster API (CAPI) with bare-metal providers (Metal3/Sidero).
  - Implement hard multi-tenancy using vCluster (Loft Labs) or similar technology, providing isolated Kubernetes API servers per tenant.
  - Deploy and manage KubeVirt for VM orchestration within Kubernetes, including CPU pinning, NUMA awareness, and HugePages configuration.
  - Implement GitOps-driven infrastructure using ArgoCD or Flux as the single source of truth for all cluster configurations.
  - Deploy and manage Policy-as-Code using Kyverno or OPA Gatekeeper for admission control, resource quotas, and security policies.
  - Build self-service capabilities using Crossplane or similar Kubernetes-native infrastructure provisioning tools.
- Storage Engineering (20%)
  - Operate and optimize Ceph distributed storage clusters (currently 1 PiB raw, 149 OSDs, Quincy 17.2.5).
  - Manage Rook-Ceph operator deployments at scale on modern Kubernetes (v1.28+).
  - Implement storage tiering: Ceph for bulk storage, local NVMe for high-IOPS workloads, LINSTOR/DRBD or TopoLVM for ultra-fast replicated storage.
  - Design and implement per-VM / per-tenant I/O isolation on shared Ceph clusters.
  - Manage CDI (Containerized Data Importer) for VM image lifecycle in KubeVirt environments.
- Networking (15%)
  - Deploy and manage overlay networks for pod networking, micro-segmentation, and WireGuard/IPsec encryption.
  - Implement Cluster Mesh for multi-datacenter pod-to-pod connectivity.
  - Configure Multus CNI and SR-IOV for multi-NIC VM support in KubeVirt.
  - Work with physical network infrastructure: Juniper switches (JunOS), BGP (eBGP/iBGP), EVPN/VXLAN, VLANs.
  - Maintain IPsec site-to-site connectivity between datacenters.
- Reliability and Operations (15%)
  - Practice SRE discipline: define and maintain SLOs with error budgets, and implement proactive capacity management with 6-12 month forecasting.
  - Design and execute chaos engineering experiments to validate system resilience.
  - Participate in the on-call rotation for IaaS infrastructure (OpenNebula, Ceph, networking).
  - Write and maintain runbooks, DRP documentation, and postmortem analyses.
  - Drive proactive improvement: identify reliability risks, performance bottlenecks, and toil, then propose and implement solutions without waiting for incidents.
- Infrastructure as Code and Automation (10%)
  - Develop and maintain Terraform/OpenTofu modules for multi-cloud infrastructure provisioning.
  - Write Ansible playbooks for bare-metal server configuration and fleet management.
  - Automate infrastructure lifecycle: PXE
What We Offer
- Competitive salary ranging from $115,000 to $195,500 USD.
- Fully remote work environment.
- Supportive team culture focused on collaboration and success.
- Opportunities for professional growth and development.
Requirements
- Proven experience in Kubernetes platform engineering and IaaS.
- Strong understanding of cloud infrastructure, networking, and storage solutions.
- Experience with GitOps practices and tools (ArgoCD, Flux).
- Familiarity with Ceph and distributed storage management.
- Proficiency in Terraform and Ansible for automation and infrastructure as code.
- Ability to work independently and collaboratively in a remote team environment.