Technical Program Manager - At Scale Engineering

Nvidia

  • Santa Clara, CA
  • Permanent
  • Full-time
  • 15 days ago
NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation that is fueled by great technology and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing: an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

NVIDIA is looking for a highly motivated Technical Program Manager (TPM) to join our Applied Systems Engineering Team to drive the design process for the next generation of NVIDIA AI supercomputing systems. The TPM plays a crucial role in defining requirements and trade-offs in the design of the latest AI systems at scale, focusing on all layers of the stack, from the datacenter and network architecture through the hardware design and systems software.

This role will drive collaboration between system architects and engineering leaders across multiple hardware and software teams, helping us work together to build AI supercomputers for internal use at NVIDIA and as a model for our customers.

What you’ll be doing:

  • Collaborate with outstanding engineers and architects to build and deploy large-scale GPU computing systems based on NVIDIA's reference supercomputing architectures
  • Define key product requirements and specifications to drive collaboration with architecture leads, systems engineers, and other program managers
  • Track the development of upcoming server, networking, and storage technologies across multiple product roadmaps to feed into integrated datacenter systems
  • Coordinate programs for designing new cluster architectures, adapting them to changing market requirements, and translating those designs into deployed computing systems for production use
  • Document system designs to facilitate the teamwork of multiple engineering groups working on datacenter deployments at scale
  • Communicate internally with engineering leadership to prioritize and address key issues essential to the success of our largest customers

What we need to see:

  • BS (Master's preferred) in Applied Science or Engineering (or equivalent experience)
  • 5+ years of overall experience
  • Experience with accelerated computing systems, high-performance computing, and Linux-based operating systems
  • A passion for understanding challenging technical problems and driving the process of finding a solution
  • Strong teamwork and interpersonal skills, to facilitate building a collaborative workflow for coordination between many teams

Ways to stand out from the crowd:

  • Understanding of datacenter design, including familiarity with power and cooling technologies
  • Experience building and using large-scale cloud computing systems
  • Experience working with the engineering or academic research community supporting high-performance computing or deep learning

The base salary range is 128,000 USD - 247,250 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and . NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

#deeplearning
