
HPC Network Engineer
- New York City, NY
- Permanent
- Full-time
- Network design, product selection, routing, configuration and troubleshooting
- Work closely with our HPC Systems team and other teams to scale our research environment
- Meet with external parties on project-related items and future endeavors
- Overall capacity planning for the data center networks
- Develop scripts and processes to increase efficiency
- Collaborate with others in the network team to solve cross-discipline problems
- Provide detailed documentation
- Mentor others as a senior resource
- 10+ years of relevant experience
- Proven ability to prioritize workloads and manage implementation schedules with meticulous attention to detail
- Proactive problem-solver with the technical skills and hands-on mentality to tackle challenges in the field
- Strong collaboration skills and a focus on continuous improvement, with the humility to own and learn from mistakes
- Excellent oral and written communication skills: able to distill complicated issues into clear and concise terms
- Able to create and maintain clear and effective technical documentation
- In-depth understanding of common layer 2 and layer 3 network protocols and technologies (OSPF, BGP, PIM, IGMP, RoCEv2, VXLAN, and spine-leaf architecture) and their best practices
- Proven ability to select and work with vendors to engineer switches for next-generation computing environments
- Experience managing Arista (EOS), Cisco (NX-OS), Nvidia (Cumulus), and SONiC-based switches, as well as an interest in exploring new platforms
- Strong design skills in computing and grid networks, with expertise in capacity planning and full-lifecycle project management
- Experience with packet decoding and analysis tools such as tcpdump and Wireshark
- Experience with public cloud networks, such as AWS, GCP, and Azure
- Familiarity with configuration management tools, such as Ansible or Salt, is desirable to support zero-touch network management
- Familiarity with Python, Prometheus, Grafana, ELK, and GitHub is desirable
- Understanding of basic power consumption and cooling issues in a data center environment
- Knowledge of fiber-optic technology and cabling standards ranging from 1 to 800 Gbps, with the ability to sort through specs and recommend appropriate purchases
- Skilled in Unix/Linux command line utilities and networking stack
- Experience with AI-style network designs and workloads is preferred