
Data & ML Infrastructure Architect
- Tukwila, WA
- Permanent
- Full-time
- Own the architecture of ML data infrastructure, enabling scalable ingestion, storage, curation, and access for 100+ engineers and researchers across autonomy teams.
- Design and evolve infrastructure to support petabyte-scale machine learning workflows, including multimodal perception data, synthetic data, simulation output, and continuous training pipelines.
- Architect high-throughput systems for distributed training on large GPU clusters, driving significant improvements in utilization, throughput, and job efficiency.
- Establish robust data governance, observability, and retention strategies to ensure compliance, reproducibility, and long-term data utility.
- Collaborate cross-functionally with ML engineers, autonomy researchers, data engineers, and DevOps to ensure tight integration between infrastructure and user workflows.
- Lead technical strategy and roadmap development for the ML & Data Platform team, incorporating cutting-edge tools and best practices from industry and open source.
- Mentor and influence engineers across teams, promoting engineering excellence in distributed systems, ML platforms, and autonomy-scale data management.
- 15+ years of meaningful software engineering experience, including significant architecture-level ownership in ML, data infrastructure, or high-scale systems.
- Proven experience leading the design of ML platforms that serve large-scale training and inference workloads.
- Deep technical fluency in distributed storage, high-volume data pipelines, and data compression strategies for ML use cases.
- Strong knowledge of Linux systems, Python, and C++ or similar performance-oriented languages.
- Experience operating in hybrid environments: bare metal, HPC, and public cloud (AWS/GCP/Azure).
- Comfortable owning cross-org initiatives and influencing system-level design across autonomy, simulation, and platform teams.
- Prior work in robotics, autonomous vehicles, or safety-critical domains strongly preferred.
- Experience building or leading infrastructure at a top-tier ML/AI company or AV program.
- Background contributing to open-source ML or data infrastructure projects.
- Familiarity with ML experiment tracking, model evaluation pipelines, and versioned data systems.
- Job Type: Direct Hire
- Location: Remote
- Salary Range: $205k-$282k/year.