Data Scientist
MaxCyte, Inc
- Waltham, MA
- $140,000 per year
- Permanent
- Full-time
- Collaborate closely with the R&D and Data Science teams to design and prototype bioinformatics workflows for NGS data analysis, contributing to the development of new assay offerings.
- Partner with Data Science and Product Development functions to design and prototype novel in-silico tools to lower off-target risk in gene editing strategies.
- Support bioinformatics engineering team members in translating prototyped pipelines into scalable, production software.
- Execute data analysis and bioinformatics pipelines while ensuring compliance with relevant regulatory standards (e.g., GLP standards) to support client projects.
- Conduct data analysis to gain a deeper understanding of current methods and improve their characterization.
- Perform data wrangling, statistical analysis, and interpretation to enable data-driven decision-making.
- Develop and optimize computational tools and pipelines to support scientific research and data interpretation.
- Diligently generate thorough documentation for development, research and pipeline execution efforts to ensure clarity and compliance.
- Stay up-to-date with the latest data science trends, bioinformatics technologies, and best practices.
- Advanced degree in Data Science, Bioinformatics, Computational Biology, Genomics, or a related field (or equivalent work experience): Ph.D. with 1 year of experience or Master's degree with 6 years of experience.
- Solid understanding of molecular biology, genomics, and next-generation sequencing (NGS) technologies with demonstrated ability to evaluate and select bioinformatics tools and build data analysis pipelines for NGS assays.
- Familiarity with CRISPR-based gene editing technologies and assays is a plus.
- Demonstrated experience with bioinformatics tools and software (e.g., Samtools, BAM tools, GATK, and other state-of-the-art variant calling tools for small indels, SNPs, and large structural variations).
- Ability to independently analyze complex, multidimensional genomic data, create visualizations, and drive decision-making through clear, story-driven presentation of data.
- Strong proficiency in the Python programming language. Additional languages, including R, are a plus. Proficiency with the Linux OS and command-line environment.
- Familiarity with version control and code management tools such as Git.
- Familiarity with high-performance cloud platforms and related tools for high-throughput computing (e.g., AWS, Nextflow, Batch).
- Strong problem-solving skills with outside-the-box thinking and attention to detail.
- Excellent communication skills and the ability to work in a collaborative, interdisciplinary environment on an ongoing basis.
- Communicate results effectively through reports and presentations to internal stakeholders.
- Assist in generating material to present findings to external clients and contribute to preparation of scientific publications.
- Ability to manage multiple projects and deadlines in a fast-paced setting.