Position Summary
The Lead Cloud Automation Engineer serves as the primary point of contact and technical leader of the Cloud Automation team. The Lead Engineer plans, prioritizes, delegates, and assigns tasks to ensure business and customer deliverables are met quickly and efficiently. The Lead Engineer coordinates with leadership to determine goals and resolve issues, proactively reporting status, challenges, issues, and constraints. The Lead Engineer monitors and adjusts priorities to balance workload and maintain team cooperation and output. The Lead Engineer reviews, critiques, and edits team work products and documentation for readability, correctness, and quality. The Lead Engineer serves as a technical thought leader, following developments in the field and exploring products and technologies to improve infrastructure quality, compliance, time to market, and cost. The Lead Engineer is responsible for designing, documenting, provisioning, and maintaining cloud infrastructure via code. The Lead Engineer prepares performance- and cost-optimized cloud services to deliver infrastructure aligned with reference architecture standards, frameworks, and patterns. The Lead Engineer maintains the infrastructure and collaborates with operational teams on change management and incident response. The Lead Engineer excels when working on complex projects, is motivated to deliver results, maintains operational excellence, and models the JetBlue values of Safety, Caring, Integrity, Passion, and Fun.
Essential Responsibilities
- Write and execute Infrastructure as Code (IaC) pipelines capable of deploying standard cloud services including virtual networks, firewalls, load balancers, storage accounts, Application Programming Interface (API) management gateways, Kubernetes clusters, message bus services, managed databases, and virtual machines
- Write and execute code capable of managing cloud governance policies, security, and cost management constructs
- Implement Configuration Management and GitOps practices to maintain infrastructure using Ansible, Salt, and Terraform
- Plan and build cost-effective Platform as a Service (PaaS) solutions including provisions for high availability and disaster recovery
- Code and deploy infrastructure leveraging Availability Zones/Availability Sets
- Create and maintain infrastructure documentation and operational procedures using tools such as Confluence and Lucidchart
- Collaborate and provide knowledge transfer to operational support teams and colleagues
- Create monitoring alerts and remediation workflows
- Build auto-remediation capabilities using automation frameworks and serverless functions
- Implement autoscaling features, container management policies, and specify virtual hardware to optimize for low cost and high application performance
- Monitor, operate, maintain, and improve the cloud environment based on operational metrics, Service Level Agreements (SLAs), and best practices
- Other duties as assigned