
Sr. Lead Software Engineer - High Performance Computing
- Seattle, WA
- Permanent
- Full-time
- Regularly provides technical guidance and direction to support the business and its technical teams, contractors, and vendors
- Develops secure and high-quality production code, and reviews and debugs code written by others
- Drives decisions that influence the product design, application functionality, and technical operations and processes
- Serves as a function-wide subject matter expert in one or more areas of focus
- Actively contributes to the engineering community as an advocate of firmwide frameworks, tools, and practices of the Software Development Life Cycle
- Influences peers and project decision-makers to consider the use and application of leading-edge technologies
- Adds to the team culture of diversity, equity, inclusion, and respect
- Build scalable and efficient inference and training pipelines using HPC software techniques and patterns
- Working closely with business and data science teams, develop easy-to-use systems that serve their needs
- Using telemetry, create measurable frameworks for deciding amongst hardware and software options
- Publish and support re-usable patterns to optimize training and inference of ML models on various architectures
- Support the developer community in applying lessons from the high-performance computing (HPC) domain
- Formal training or certification in software engineering concepts and 5+ years of applied experience
- Hands-on practical experience delivering system design, application development, testing, and operational stability
- Advanced proficiency in one or more programming languages
- Advanced knowledge of software applications and technical processes with considerable in-depth knowledge in one or more technical disciplines (e.g., cloud, artificial intelligence, machine learning, mobile, etc.)
- Ability to tackle design and functionality problems independently with little to no oversight
- Practical cloud native experience
- Experience in Computer Science, Computer Engineering, Mathematics, or a related technical field
- Advanced understanding of High-Performance Computing system architectures and network topologies
- Expertise in at least one accelerator type (e.g., GPU, FPGA) and experience mapping LLMs onto these accelerators
- Proficiency in parallel programming and performance analysis of accelerator-based systems
- Familiarity with HPC software (e.g., NCCL, MPI) and resource schedulers (e.g., Kubernetes, SLURM)
- Strong programming skills in Python, scripting languages, C, and C++, with experience in AI/ML frameworks such as PyTorch and LangChain
- Master's Degree in Computer Science (required)
- 8+ years of experience in high-performance computing software
- 5+ years of experience with accelerators and deep learning, particularly large language models
- Experience in large organizations and regulated industries is a plus
- Excellent communication skills and the ability to work collaboratively in a dynamic team environment