Bioinformatics Engineer - Production Pipelines (Clinical NGS)
SAGA Diagnostics
- Morrisville, NC
- Permanent
- Full-time
- Develop, implement and maintain robust, scalable pipelines for the analysis of Illumina whole genome sequencing data, with an emphasis on structural variant calling and QC metrics.
- Apply software engineering best practices including modular code development, unit testing, code reviews, documentation, and continuous integration to ensure reproducibility, maintainability, and compliance.
- Collaborate with cross-functional teams (R&D, QA, software, clinical operations) to identify and drive improvements in pipeline design, execution, and automation, and to streamline data processing and analysis.
- Troubleshoot pipeline failures and bugs across production environments, conduct in-depth root cause analyses, and implement fixes with appropriate validation strategies and documentation, in accordance with quality standards.
- Optimize cloud infrastructure (primarily AWS) for performance, reliability, scalability, and cost-efficiency in high-throughput NGS data processing.
- MSc (with 5+ years industry experience) or PhD (with 3+ years industry experience) in bioinformatics, computer science, computational biology, or a related field. Prior experience in a clinical genomics or molecular diagnostics setting highly preferred.
- Strong programming skills in Python and proficiency in Linux/Bash.
- Demonstrated application of software development best practices, including version control (Git), CI/CD pipelines (GitHub Actions, GitLab CI, etc.), testing frameworks, and issue tracking systems.
- Experience in developing and deploying production bioinformatics workflows using workflow languages such as Nextflow or Snakemake.
- Experience with containerization technologies such as Docker or Singularity, and environment management with Conda or similar tools.
- Hands-on experience with cloud-based infrastructure (preferably AWS) for scalable data analysis workflows.
- Familiarity with software QA processes, including verification and validation (V&V) in clinical or regulated environments.
- Excellent written and verbal communication skills in English.
- Experience with Illumina sequencing data, including alignment, variant calling (especially SVs) and read-level QC.
- Strong grasp of cancer biology and molecular diagnostics.
- Experience with relational databases (SQL).
- Exposure to statistical or machine learning approaches for genomics data analysis.
- The opportunity to work with an incredible team that has access to fantastic data.
- As a member of a small team, you will be involved in every aspect of the business and help set the direction/culture as we grow.
- You will be given the autonomy and resources to deliver to the highest level.
- All the perks of a start-up: membership in SAGA’s equity plan, highly competitive salaries, exciting technology and innovation, and a dynamic work environment.
- Competitive compensation and a company-wide benefits plan.
- Opportunities for career advancement and professional development.
- A collaborative and innovative work environment dedicated to improving oncology outcomes.