
Site Reliability Engineer II
- Saint Louis, MO
- Permanent
- Full-time
- Infrastructure Management: Architect, build, and scale AWS infrastructure using Infrastructure as Code (IaC) tools such as Terraform.
- CI/CD & Deployment: Design, implement, and optimize CI/CD pipelines using tools like GitHub Actions, ArgoCD, or similar to streamline deployments and improve release velocity.
- Kubernetes Operations: Manage and optimize Kubernetes-based infrastructure (Amazon EKS) to ensure scalability, reliability, and efficient resource utilization.
- Observability & Incident Response: Build and maintain monitoring, alerting, and logging systems (Prometheus, Grafana, Datadog, Loki) to ensure high availability; participate in the on-call rotation to resolve incidents.
- Security & Compliance: Implement and maintain security controls to meet PCI DSS, HIPAA, GDPR, and SOC 2 standards, and support audit readiness.
- System Architecture: Contribute to designing fault-tolerant architectures with disaster recovery and high-availability strategies, both inside and outside the cardholder data environment (CDE).
- Developer Enablement: Partner with developers to improve deployment workflows, reduce lead time for changes, and provide platform tooling support.
- Documentation & Knowledge Sharing: Create clear runbooks, technical documentation, and knowledge base articles to support team-wide learning and operational excellence.
- 3-5 years of experience in SRE, DevOps, or Platform Engineering roles, including at least 2 years in a mid-level or senior capacity.
- Strong hands-on experience with AWS services and IaC tools like Terraform.
- Expertise in Kubernetes operations in production environments (Amazon EKS preferred).
- Familiarity with compliance frameworks (PCI DSS, HIPAA, GDPR, SOC 2) and cloud security best practices.
- Excellent problem-solving, troubleshooting, and incident management skills.
- Experience supporting developers in platform engineering or internal tooling contexts.
- Familiarity with NIST Cybersecurity Framework (CSF) implementation in SaaS/cloud environments.
- Strong networking fundamentals (TCP/IP, DNS, HTTP, TLS, firewalls).
- Experience with AWS networking services (VPC, Route 53, NAT Gateway, ALB/NLB).
- Background in cost optimization and cloud governance.
- Strong scripting/programming skills (Bash, Python, Go).
- Automating infrastructure with the latest DevOps tools.
- Experimenting with AI-powered observability or security tools.
- Following the latest drops from AWS, CNCF, and open-source SRE communities.
- Reading engineering blogs, RFCs, and architecture deep dives.
- Playing with side projects that push the boundaries of automation…
- Fully remote team — work from anywhere in the U.S.
- Mission-driven culture with smart, supportive, and AI-obsessed teammates
- Career growth — this role is built for someone who wants to continue to level up
- Great benefits: healthcare, 401(k), unlimited PTO, learning stipends, and more