
Site Reliability Engineer
- Chicago, IL
- Permanent
- Full-time
- Infrastructure & Architecture: Design, implement, and operate highly available, scalable, and fault-tolerant infrastructure, primarily on GCP but extending to multi-cloud deployments. Optimize system performance, manage disaster recovery, and ensure cost-effectiveness.
- Infrastructure as Code: Lead Terraform-based infrastructure development with security best practices, encrypted state management, and governance tools.
- CI/CD & DevSecOps: Build robust pipelines supporting hundreds of developers and AI engineers. Integrate automated security testing, vulnerability scanning, and compliance checks throughout the development lifecycle.
- Monitoring & Incident Response: Implement comprehensive observability strategies using Prometheus, Grafana, and ELK. Define SLOs/SLIs, manage error budgets, and lead incident response with blameless post-mortems.
- Compliance & Security: Navigate complex regulatory requirements for the U.S. Aerospace and Defense Industrial Base. Collaborate with security and legal teams on expanding compliance standards.
- Automation & Collaboration: Reduce operational toil through Python, Go, or Bash automation. Work in a follow-the-sun model with global teams while taking primary responsibility for US platform partition incidents and operations.
- Bachelor's degree in Computer Science, Engineering, or equivalent experience
- 4+ years in Site Reliability Engineering, DevOps, or Systems Engineering with cloud-based SaaS platforms
- Deep Terraform and Infrastructure as Code expertise with security best practices
- Proficiency in Python and other scripting/programming languages
- Modern CI/CD experience (GitHub Actions, GitLab CI, Jenkins, ArgoCD, Spinnaker), including AI/ML workloads
- Strong cloud platform experience, preferably GCP (AWS, Azure experience also valuable for future multi-cloud deployments)
- Experience building and optimizing containers (Docker) and configuring orchestration (Kubernetes)
- Monitoring tools experience (Datadog, Prometheus, Grafana, etc.)
- Regulated-industry experience (Aerospace & Defense, Finance, Healthcare), including building secure platforms
- DevSecOps principles and security integration experience
- Security-first development mindset with understanding of secure infrastructure practices
- Strong problem-solving and communication skills for distributed team environments
- Hyper-growth startup experience
- AI Safety experience
- MLOps and AI/ML infrastructure security experience
- Competitive Site Reliability Engineer salary in the Chicago market with company-paid healthcare benefits, 401k matching, generous time off, and work/life balance.
- In-depth experience in various aspects of the international tech startup environment in Chicago.
- Opportunity to contribute to developing and implementing a winning strategy as a foundational member, putting your stamp on CADDi's direction in the US.
- Exposure to cross-functional collaboration and leaders within a growing startup environment, where your voice will be heard.
- The chance to directly impact customer satisfaction, retention, and business growth, helping multiple manufacturing businesses succeed and grow in the US.
- Comprehensive Health Benefits: We provide comprehensive employee health insurance, 100% company-covered, including medical (UnitedHealth), dental (Principal), and vision (VSP) to keep you and your family healthy.
- Ownership & Rewards: Be a part of our success story with a competitive stock options plan.
- Financial Security: Start saving for your future with our 401k plan, featuring a generous 4% company match starting on day one.
- Generous Time Off: Maintain a healthy work-life balance with 15 days of paid time off, five dedicated sick days, and ten company holidays to celebrate throughout the year.
- Thriving Culture: We foster a vibrant work environment with delicious company lunches, engaging events, and healthy drinks and snacks to keep you fueled. Celebrate your achievements with us at quarterly events and holiday gatherings.
- Learning & Development: We invest in your growth by providing opportunities to join professional organizations, attend industry conferences, and participate in various learning initiatives.
- Financial Incentives: Take advantage of commuter and parking benefits to simplify your daily commute. We also offer referral bonuses to help you spread the word about exciting opportunities at CADDi.