Director of Cloud Infrastructure & SRE (Golden)
Want to apply? Make sure your CV is up to date, then read the following job description carefully before applying.
The Team
In this role, you will be part of the Engineering Leadership team, reporting to the EVP, Technology.
Office Expectation
This role is hybrid, with an expectation of 3 days a week in-office in Golden, CO. Core hours are 8 a.m. to 5 p.m.
Compensation
Targeting a base salary of $175K-$185K with a 20% bonus
The Overview:
Our client is seeking a strong Director of Cloud Infrastructure & SRE to lead the design, implementation, and optimization of their multi-cloud platform supporting real-time, high-volume financial data flows. This is a hands-on leadership role where you'll shape cloud strategy, define best practices, and scale mission-critical systems that power modern banking infrastructure.
You’ll head global teams of DevOps, SRE, and CloudOps engineers, champion Infrastructure-as-Code and AI/ML-based automation, and work with tools like Kubernetes, Terraform, CI/CD pipelines, Prometheus, and ELK. The ideal candidate brings deep technical expertise in Azure and AWS, strong security and compliance knowledge (SOC2, GDPR, NIST), and a passion for building resilient, secure, and scalable systems.
What you’ll own:
As a leader, you will drive the DevOps and Cloud Infrastructure transformation in a rapidly growing organization, guiding multiple teams in delivering on business priorities while collaborating with development leaders and executives to define and advance best practices
We are seeking a strategic and experienced leader to oversee cloud infrastructure, Site Reliability Engineering (SRE), and CloudOps for our large-scale connected-products ecosystem
Cloud Infrastructure & SRE Strategy
Define and execute global cloud operations and SRE strategies, ensuring 99.99%+ uptime for mission-critical financial services applications
Architect, implement, and optimize multi-cloud infrastructure to support financial services applications with low-latency data processing, scalability, and high availability
Drive cost optimization strategies while balancing performance, redundancy, and financial efficiency across cloud platforms (Azure & AWS)
Develop automated deployment, monitoring, and recovery systems using technologies like Kubernetes, Terraform, Ansible, and CI/CD pipelines
Reliability, Performance & Incident Management
Establish and refine SLOs, SLIs, and KPIs for service reliability, performance, and capacity planning
Build and optimize incident management, disaster recovery, and resilience engineering frameworks
Leverage AI/ML-driven automation for proactive failure detection and remediation
Implement robust security practices, ensure cloud security and compliance with standards such as SOC2, GDPR, and NIST, and oversee the zero-trust security model
Collaborate with security and compliance teams to manage risk and ensure regulatory adherence across cloud platforms
Team Leadership & Cross-Functional Collaboration
Lead and mentor a team of DevOps Engineers, SREs, Escalation Engineers, and software professionals, fostering a culture of continuous learning and innovation
Partner with product management, software engineering, and customer support to optimize software scalability and performance
Collaborate with executive leadership to develop long-term cloud investment strategies
Requirements
Necessary Qualifications:
10+ years in Computer Science, Engineering, or a related field
10+ years of experience in Cloud Operations, SRE, or Infrastructure Engineering, with 8+ years in technical leadership roles
Experience managing medium- and large-scale distributed cloud environments supporting millions of data connections per day
Deep professional experience in Azure and AWS cloud platforms including networking, storage, compute, and database services
Experience with Kubernetes, Terraform, CI/CD pipelines, and application monitoring and observability tools (e.g., Prometheus, Grafana, ELK)
Experience in large-scale systems design and architecture, with a focus on reliability, performance, and scalability of cloud-native platforms
Hands-on experience with Infrastructure-as-Code (IaC) tools like Terraform, CloudFormation, Ansible, CDK, and Pulumi, and with managing cloud-native architectures
Strong background in AI/ML-driven automation for cloud infrastructure monitoring, self-healing, and optimization
Solid understanding of security-first cloud architectures, DevSecOps, and compliance standards (PCI, SOC2, GDPR, NIST)
Proven ability to manage teams across multiple global time zones, ensuring operational excellence and driving performance in large, distributed environments
Expertise in incident management, disaster recovery, and building resilience engineering frameworks
Ability and desire to review code, system designs, and engage in system engineering discussions and decisions
Expertise in serverless architecture and edge computing
Strong financial acumen in cloud cost management and forecasting
Familiarity with regulatory compliance frameworks such as SOC2, GDPR, PCI, and ISO 27001
Relevant certifications in Azure or AWS Cloud Practices
Location
Golden, CO