Senior Platform Engineer
Location: US (Remote)
About Platform9
Platform9 is the leader in enterprise Private Cloud. Founded by VMware cloud veterans, we build Private Cloud Director — software that turns your existing hardware into a full-featured, future-ready private cloud. We stay focused on one thing: exceptional customer outcomes.
Enterprises choose Platform9 to replace legacy virtualization because it removes operational complexity without forcing a rip-and-replace. Private Cloud Director gives infrastructure teams a familiar GUI for managing VMs and containers, seamless integration with existing hardware and third-party storage, and critical enterprise features like HA/DR, scale, and reliability — all while unlocking robust API control and vendor independence.
With over 30,000 nodes in production at companies like Cloudera, EBSCO, Juniper Networks, and Rackspace, Platform9 is the proven path to a modern, open private cloud. We are backed by prominent investors and supported by a partner ecosystem of resellers, SIs, MSPs, and technology vendors. Our values — innovation, customer obsession, ownership, radical candor, and excellence — guide every decision.
About the Role
We’re looking for a Senior Platform Engineer who treats cloud spend as a first-class engineering concern alongside reliability and performance. You’ll design, operate, and continuously improve our cloud infrastructure — and you’ll own the FinOps discipline that keeps it cost-efficient at scale.
Day to day, that means partnering with engineering, SRE, and sales to build cost visibility, right-size resources, optimize commitment strategies, and automate everything from provisioning to deployment. You’ll also be a key escalation point for production incidents and a force multiplier for the broader engineering org through tooling and process improvements.
Responsibilities
- Design, implement, and maintain cloud infrastructure across multiple hyperscalers (primarily AWS), including Kubernetes clusters, OpenStack environments, and supporting services.
- Own cloud cost optimization end-to-end: analyze spend, eliminate waste, right-size resources, and manage commitment strategies (Reserved Instances, Savings Plans) to reduce total infrastructure cost.
- Establish and evolve FinOps practices across the org — cost allocation, chargeback/showback models, tagging policies, and spend forecasting — so engineering teams can make financially informed infrastructure decisions.
- Automate infrastructure provisioning, configuration management, and application deployments using Terraform, Flux, and similar tools.
- Build and maintain observability for both system health and cost efficiency using Prometheus, Grafana, Loki, and related tools; surface spending trends to engineering and leadership through clear dashboards and regular reporting.
- Develop internal tooling and scripts that reduce toil and improve operational leverage.
- Collaborate with engineering teams to design and maintain CI/CD pipelines.
- Participate in the on-call rotation and serve as a senior escalation point for infrastructure, application, and performance incidents.
- Stay current on trends in cloud computing, DevOps, and cloud financial management, and bring relevant ideas back to the team.
Qualifications
- 5+ years in a DevOps or SRE role with deep experience in cloud infrastructure and operations.
- Demonstrated experience with FinOps principles — commitment management, rightsizing, waste reduction, cost allocation, and translating infrastructure decisions into financial impact for non-technical stakeholders.
- Extensive Kubernetes experience: cluster administration, deployment strategies, and production troubleshooting.
- Proficiency in infrastructure-as-code (Terraform, Ansible, or similar).
- Strong scripting skills in Python or equivalent; strong systems programming skills in Go or equivalent.
- Solid configuration management experience with Salt, Chef, or similar.
- Hands-on experience with observability tooling: Prometheus, Cortex, Grafana, Loki.
- Familiarity with CI/CD tools and best practices.
- Strong Linux administration and debugging skills.
- Excellent communication skills — you can explain an infrastructure trade-off or a cost anomaly to an engineer and a finance lead in the same conversation.
- Proven incident management experience.
- OpenStack experience is a plus, not a hard requirement.
Bonus Points
- EKS (Elastic Kubernetes Service) experience.
- Experience managing on-premises infrastructure.
- FinOps Foundation certification (FOCP) or equivalent.
- Experience with cloud cost management platforms (Archera, CloudHealth, Apptio Cloudability, AWS Cost Explorer).
- Familiarity with OpenTelemetry and AI-powered observability tools.
- Experience in a fast-paced startup environment.
Benefits and Perks
Employees today look for companies that truly care about and recognize the whole person. Platform9's benefits and perks are designed to support employees' emotional and physical well-being. Many of our benefits extend to families, who play a significant part in our well-being at work. Please note that benefits vary by country.
- Competitive Compensation and Equity
- Medical Healthcare for you and your family
- Hybrid Work Model
- Wellness Benefits
- Professional Development/Global certifications
- Reward and Recognition Programs
- Team Building Activities
- Our benefits have been carefully selected with employees' requirements and personal situations in mind, now and for the future.
(Salary Range: $150K-$180K/year)