
SOC Architect, Platform Architecture
- Cupertino, CA
- Permanent
- Full-time
- Collaborate with domain experts to efficiently map large ML workloads to distributed systems. Understand the performance sensitivity of these workloads to networking parameters like bandwidth, latency, topology, routing, and error rate.
- Develop architecture specifications for networking IP.
- Collaborate with IP architecture, logic design, verification, firmware, software, and system teams to ensure successful end-to-end execution.
- BS degree
- Understanding of computer architecture and HW/SW partitioning
- Experience in architecting, designing, and verifying high-performance ASICs
- Understanding of distributed AI/ML workloads and data flows
- Understanding of HW and SW aspects of data center networking technologies such as Ethernet, TCP/IP, RDMA/RoCE, NVLink, or similar
- 20+ years of relevant experience
- Understanding of state-of-the-art LLMs and how they use the compute, memory, and networking resources of modern machines
- Understanding of emerging networking standards
- Understanding of large-scale network behavior and its effect on high-performance distributed applications
- Understanding of network topologies, traffic management techniques, QoS, and error recovery
- Experience with networking hardware design
- Familiarity with relevant SW technologies like NCCL, UCX/UCC, PyTorch/JAX, MPI, SHMEM, and libfabric
- Ability to study a problem in depth, design experiments, analyze data, present results, and collaborate effectively across disciplines and geographies