Research Data Analyst I Bioinformatics

Moffitt Cancer Center

  • Tampa, FL
  • Permanent
  • Full-time
  • 1 month ago
Position Highlights:
  • The colorectal cancer epidemiology research program under Dr. Siegel is looking for a Research Data Analyst to oversee data management and analysis of data collected across the portfolio of colorectal and anal cancer protocols and grants.
  • This position will manage and analyze datasets from gastrointestinal (GI) epidemiological studies, including an ongoing national colorectal cohort study of over 800 Moffitt patients followed for 5 years post-diagnosis with longitudinal questionnaire data, biospecimen collections, and disease outcomes. This cohort is a resource for internal and external collaborators, and its data will support several ongoing funded projects.
  • This position is responsible for overall data management for the study cohort, which includes synthesizing data from several sources, including Moffitt's Health and Research Informatics (HRI) platform, Laboratory Information Management System (LIMS), and Research Electronic Data Capture (REDCap) system.
  • This position provides support in data harmonization, data dictionary development, and streamlined data transfer to third-party research projects.
  • In addition, this position will be responsible for defining and creating analyzable datasets from the harmonized dataset and for participating in writing and preparing first drafts of manuscripts for molecular epidemiology research studies.
  • A portion of this position's time will be dedicated to project management and support of the study coordination team.
The Ideal Candidate:
  • A professional, personable, organized, detail-oriented individual with experience managing several sources of data, analyzing data for epidemiologic studies, synthesizing literature reviews and summarizing study findings. The ability to multi-task well is highly desirable.
  • Possesses data-driven problem-solving skills and creative thinking to optimize logistical processes of study development.
  • Experience using data query, reporting, documentation, and visualization tools in MySQL, Shiny, R Markdown, and R/RStudio. Experience with Google Cloud and in a Linux/Unix environment is a plus but not required.
  • Coding experience in SAS, Stata, or other relevant statistical software
  • Experience using software that streamlines data workflows, such as GitLab and Application Programming Interfaces (APIs).
  • Manuscript writing skills are highly desired.
  • Research coordinator and project management experience is a plus
Responsibilities:
  • Oversees database support, data management and data analysis
  • Procure relevant data from several sources; conduct data cleaning, processing, and recoding; and prepare and document final analytic datasets. Prepare associated documentation for each dataset.
  • Review, validate, and report data and metadata from biorepositories and databases.
  • Develop Standard Operating Procedures outlining data acquisition processes and project development.
  • Conduct data quality checks to ensure data integrity and inferentiality.
  • Run basic descriptive and multivariate (logistic and survival-based regression) statistical analyses.
  • Works closely with the PI or collaborators to prepare manuscripts for publication
  • Works closely with the Principal Investigator to manage ongoing research project activities and to provide guidance to research coordination team, as assigned.
  • Works closely with the Research Coordinator and Cores' staff to document metadata, sample acquisition, and project development.
Credentials and Qualifications:
  • Bachelor's degree in biostatistics, bioinformatics, computer science, computational biology, epidemiology, or related field plus 5 years of relevant experience
  • OR
  • Master's degree in biostatistics, bioinformatics, computer science, computational biology, epidemiology, or related field with relevant experience is preferred.
  • Completion of an intense data boot camp is equivalent to 1 year of experience
  • Preferred experience: Proficiency with Microsoft Office software, Unix environment, and statistical packages such as R, Python, SAS or other relevant software.
  • Familiarity with next-generation sequencing or other high-throughput bioinformatic data; high-dimensional data wrangling and analysis techniques, including machine learning, parallelization, and out-of-memory computation in a cluster environment; and high-dimensional data visualization.
