SR HIGH PERFORMANCE COMPUTING SYSTEMS ENGINEER

Moffitt Cancer Center

  • Tampa, FL
  • Permanent
  • Full-time
  • 1 month ago
At Moffitt Cancer Center, we strive to be the leader in understanding the complexity of cancer and applying these insights to contribute to the prevention and cure of cancer. Our diverse team of over 9,000 is dedicated to serving our patients and creating a workspace where every individual is recognized and appreciated. For this reason, Moffitt has been recognized on the 2023 Forbes list of America’s Best Large Employers and America’s Best Employers for Women, Computerworld magazine’s list of 100 Best Places to Work in Information Technology, and DiversityInc Top Hospitals & Health Systems, and is continually named one of the Tampa Bay Times’ Top Workplaces. Additionally, Moffitt is proud to have earned the prestigious Magnet® designation in recognition of its nursing excellence. Moffitt is a National Cancer Institute-designated Comprehensive Cancer Center based in Florida, and the leading cancer hospital in both Florida and the Southeast. We are a top 10 nationally ranked cancer center by Newsweek and have been nationally ranked by U.S. News & World Report since 1999.
Working at Moffitt is both a career and a mission: to contribute to the prevention and cure of cancer. Join our committed team and help shape the future we envision.
Summary
SR High Performance Computing Systems Engineer
Position Highlights:
  • The Sr HPC Systems Engineer designs, develops, evaluates, and modifies software packages for the solution of scientific or engineering problems and for the support of research and development at Moffitt Cancer Center. The Sr HPC Systems Engineer analyzes existing systems and formulates logic for new systems. The Sr HPC Systems Engineer also devises logical procedures, prepares flow charts, and codes, tests, and debugs programs. The individual provides input for the documentation of new or existing programs and determines system specifications, input/output processes, and working parameters for hardware/software compatibility. The Sr HPC Systems Engineer also contributes to decisions on policies, procedures, expansion strategies, and product evaluations for the HPC resources. This position is focused primarily on working with the High Performance Computing (HPC) Cluster system. While the Director of Scientific Computing, the HPC steering committee, and the Cancer Informatics Core Scientific Director and Manager will provide general project direction, the individual must exercise their own judgment for daily implementation and maintenance. The Sr HPC Systems Engineer applies technical expertise and background to work within a team of pure and applied scientists and software engineers to consult for and support scientific researchers who use HPC resources at the Moffitt Cancer Center. The Sr HPC Systems Engineer engages with principal investigators and their labs, core facilities, and individual researchers. This position creates and optimizes computational solutions to the specific scientific computing needs of each constituency, ensuring that the appropriate technology resources are identified and utilized optimally. This position consults on applicable software packages and algorithms and assists in optimizing them for scalability and (massive) parallelization as needed.
The Sr HPC Systems Engineer will also manage operational and automation tasks for the HPC Cluster system, including scheduling software, package management and version control, and assist the Director of Scientific Computing in reconfiguration of HPC management systems to best suit researchers’ needs.
Responsibilities:
  • Configures, debugs and ensures stable operation of HPC cluster tools such as Bright Cluster Manager, OpenHPC, and Warewulf.
  • Configures, optimizes and ensures stable operation of software such as MATLAB and open source equivalents such as Octave and SciLab. In addition, ensures that any added module or toolbox needed for this software is working properly, including necessary licenses.
  • Configures, optimizes and ensures stable operation and availability of Message Passing Interface (MPI) implementations such as Open MPI, along with related libraries and utilities.
  • Configures, optimizes and ensures stable operation and availability of the R statistical package and necessary extra modules needed by the HPC cluster users.
  • Configures, optimizes and ensures stable operation of GPU-related software and libraries such as CUDA.
  • Configures, optimizes and ensures stable operation of specific software needed for mathematical oncology, bioinformatics, biostatistics and any other groups of research involved in the use of the HPC cluster.
  • Creates deployment scripts that facilitate the deployment of commercial off-the-shelf (COTS) and custom applications.
  • Estimates time and effort involved in realizing new cluster capabilities for enterprise-level resource allocation, project planning and forecasting purposes.
  • Automates solutions for routine tasks such as system deployments, database backups and open source software provisioning.
  • Collaborates with software developers to build, maintain, test and deploy user-friendly web-based interfaces that simplify scientists' views of their data and workflows.
  • Educates cluster user community on the optimal use of the cluster's computational resources via one-on-one collaboration, workshops and preparation of relevant documentation and tutorials.
  • Performs other duties as assigned.
  • Monitors and reports on system usage and load.
  • Investigates and implements new technologies within an HPC environment.
  • Participates in cluster governance process
  • Provides cluster training and education
Credentials and Experience:
  • Bachelor’s Degree – field of study: Computer Science, Information Technology, Management Information Systems, Biology/Chemistry/Physics or related field
  • Minimum of seven (7) years' experience designing, developing and successfully administering or supporting Unix-based systems
  • Well-versed in cluster development methodologies, in particular open-source operating systems, tools, languages and frameworks for cluster environments.
Proficient in one or more of the following languages:
  • Python
  • Shell scripting (e.g., BASH, CSH, KSH)
  • C/C++
Proficient in one or more of the following tools:
  • Git
  • Trac
  • Modules
  • Fabric/Puppet or other code deployment tools
  • Docker
  • Singularity/Apptainer
*A High School diploma plus an additional four (4) years of relevant experience designing, developing and successfully administering or supporting Unix-based systems (for a total of eleven (11) years’ relevant experience) may be considered in lieu of a Bachelor's degree.
Preferred Experience:
  • Familiarity with one or more of the following packages: R, MATLAB (or similar open source tools such as Octave or SciLab)
  • Familiarity with Slurm HPC scheduling
  • Working knowledge of computer hardware, networking concepts and tools
  • Working knowledge of protocols such as SSH, DNS, DHCP, and LDAP
