Infrastructure Engineer
datma
- Salt Lake City, UT
- Permanent
- Full-time
- Architect, deploy, and manage Kubernetes clusters running in customer cloud tenancies (AWS, Azure, GCP).
- Create robust infrastructure-as-code templates (e.g., Terraform, Helm) for repeatable deployments.
- Implement scaling, monitoring, and disaster recovery strategies, along with observability solutions (metrics, logging, tracing), for proactive infrastructure management.
- Automate deployment processes, including automated testing, for data pipelines, ML models, and analytics applications to improve release velocity and stability.
- Manage containerization and orchestration of data services and workloads using Docker and Kubernetes.
- Troubleshoot performance and reliability issues across environments.
- Evaluate and recommend infrastructure solutions by conducting cost-benefit analyses that compare open-source and cloud-native alternatives.
- Implement and maintain security controls aligned with HIPAA and HITRUST frameworks. Partner with compliance teams to ensure infrastructure supports ongoing certification and audit requirements.
- Configure secure networking, identity and access management (IAM), encryption (in transit and at rest), and audit logging.
- Build infrastructure that both hosts in-house AI models and integrates with external AI services (e.g., GPT-5 via OpenAI APIs).
- Optimize data pipelines and storage for AI training and inference workloads. Support GPU-based compute environments for ML workloads when required.
- Design and manage scalable API gateways and authentication mechanisms for external data consumers.
- Ensure API infrastructure can handle high-throughput, low-latency access to sensitive healthcare datasets.
- Collaborate with the data/applications team to develop and optimize data processing pipelines, using data orchestration tools like Prefect or cloud-native solutions, and support diverse client integrations (Python, R, SQL, BI tools, etc.).
- 3+ years of experience in cloud infrastructure engineering, preferably in a regulated data environment.
- Deep expertise with Kubernetes and container orchestration in production.
- Strong proficiency in Infrastructure as Code tools (Terraform, Helm, Ansible, etc.).
- Experience with cloud security best practices and regulatory frameworks (HIPAA, SOC 2, or HITRUST).
- Hands-on experience with CI/CD pipelines and monitoring tools (e.g., Prometheus, Grafana, ELK).
- Proficiency in Python and/or Go, plus SQL and Bash scripting.
- Understanding of data modeling, warehousing concepts, and data pipeline orchestration tools.
- Experience deploying in customer-owned cloud environments.
- Familiarity with secure API design and management (OAuth2, JWT, API gateways).
- Knowledge of machine learning infrastructure and MLOps practices.
- Background involving healthcare data and associated interoperability standards (FHIR, HL7).
- Prior work supporting HITRUST certification efforts.
- Experience with multi-tenant architecture design.