
Associate Server Systems Administrator
- Nashville, TN
- Permanent
- Full-time
ACCRE Storage Systems Administration
- Maintain, administer, and improve ACCRE's computational services
- Set, implement, and audit user access controls
- Aid in operational security implementations
- Solve user support issues
- Troubleshoot hardware and software problems related to storage systems
- Be the primary support for hardware break-fix and inventory management
- Be a member of the team developing, deploying, and supporting the compute cluster and auxiliary systems
- Adapt existing software tools to support rapid response to critical issues, system expansion, and hardware replacement
- Set up/configure cluster hardware related to infrastructure and cluster services
- Install operating system and related utility software
- Monitor the status of the cluster using tools such as CheckMK, and customize those tools for ACCRE-specific needs
- Serve as a technical resource to users and other ACCRE staff members
- Coordinate critical tasks with other team members to meet project guidelines
- Act as an internal technical consultant to ACCRE staff, particularly on projects for which this position serves as the primary systems administrator
- An Associate or Bachelor's degree from an accredited institution of higher education
- Vanderbilt Export Compliance regulations limit this position to US citizens and permanent residents
- The ability to physically move and lift hardware up to 50 pounds
- Two years of experience administering UNIX/Linux-based operating systems or managing compute cluster subsystems
- Demonstrated experience with shell and/or Python scripting of moderate complexity
- Demonstrated self-driven, inquisitive, and productive troubleshooting abilities
- Strong ability to work individually and in a team environment
- Knowledge of and experience with Git version control
- Knowledge of and experience with configuration management tools such as Ansible
- Demonstrated success in taking initiative, meeting deadlines, and adjusting to operational shifts
- Experience with Red Hat-based systems
- Experience in an HPC environment
- Experience with server hardware (SAS, JBOD, RAID, HBA, RAID controllers, etc.)