Staff Site Reliability Engineer - Network Management Platform
Fastly
- Denver, CO
- Permanent
- Full-time
- Architect, build, deploy and maintain robust infrastructure using public cloud providers to run diverse types of workloads
- Maintain and improve configuration management processes, CI/CD pipelines, utilities, and tooling
- Consider scalability, security, performance, reliability and ease-of-use in the design of the services you develop and support
- Build automation that reduces manual interaction and promotes continuous improvement
- Foster relationships with other teams to understand and meet end-user needs
- To be successful in this role, you must have 5 years of experience operating highly available distributed systems and supporting mission-critical infrastructure
- You have a minimum of 3 years of experience with build and packaging tools, CI systems, release engineering practices and CD platforms
- You have at least 3 years of experience with automating public cloud workloads (GCP, AWS), and using IaC solutions such as Terraform
- You have 3 years of experience building and managing containerized applications and working with container tooling and orchestration platforms such as Docker and Kubernetes
- You have a track record of instrumenting applications and integrating them with different observability solutions
- You have experience in software development and DevOps best practices
- You have experience with configuration management software such as Ansible and Chef
- You are familiar with monitoring solutions like Prometheus and Grafana
- You have experience with compiled languages, preferably Go
- You have knowledge of key-value stores and SQL databases
- You have experience collaborating with cross-functional engineering teams
- San Francisco, CA
- Los Angeles, CA
- Denver, CO