Sr. Data Scientist
Perfect Path, LLC, d/b/a Trajector Services
- Gainesville, FL
- Permanent
- Full-time
- Competitive compensation ranging from $160,000 to $175,000 per year, with total compensation ranging from $180,000 to $197,000.
- Medical, dental, vision, 401(k) program, and more
- Paid time off, including seven (7) federal holidays plus two (2) flex holidays for DEI
- Joining a rapidly growing organization
- Design, develop, and deploy scalable NLP systems to process, analyze, and extract information from medical and legal documents.
- Lead the exploration and application of cutting-edge NLP techniques, including transformers, large language models, information retrieval, recommendation, summarization, and personalization systems, to solve domain-specific challenges.
- Collaborate with cross-functional teams, including product managers, engineers, and domain experts, to define project goals, requirements, and deliverables.
- Drive end-to-end development of AI/ML solutions, including data preprocessing, model training, evaluation, deployment, and performance monitoring.
- Ensure solutions meet high standards of data security, privacy, and compliance
- Mentor junior data scientists, fostering technical growth and knowledge-sharing within the team.
- Stay abreast of emerging trends and technologies in NLP and machine learning and identify opportunities for their application in our systems.
- Contribute to the strategic direction of technology and product development within the organization
- Contribute to the long-term AI/ML technical vision and roadmap
- Master’s or Ph.D. in Computer Science, Data Science, Statistics, Computational Linguistics, or a related field.
- 5-8 years of professional experience in data science, preferably with a focus on natural language processing.
- Proven track record of solving business problems by delivering data science solutions at scale
- Expert in building and deploying machine learning models, including deep learning techniques and model fine-tuning
- Expert in data processing, feature engineering, analytics and visualization for structured and unstructured data
- Proficiency in Python
- Proficiency in SQL
- Proficiency in using source control systems like GitHub
- Proficiency in AI/ML/NLP frameworks such as Hugging Face, spaCy, and scikit-learn
- Proficiency in developing prompts for generative AI, including evaluating the output
- Proven skills in translating complex data insights into clear, actionable business strategies
- Background in mathematical fundamentals, including statistics, probability, linear algebra, and optimization methods
- Experience demonstrating the impact of data products using appropriate quantitative metrics
- Experience with cloud platforms (e.g., AWS, Azure, Google Cloud) and ML model deployment in production.
- Proven ability to contribute to complex projects and deliver results in a fast-paced environment.
- Experience working with medical or legal documents, including familiarity with domain-specific regulations (e.g., HIPAA, GDPR).
- Familiarity with OCR (Optical Character Recognition) technologies and integrating structured and unstructured data sources.
- Background in reinforcement learning, unsupervised learning, outlier detection, or graph-based NLP techniques.
- Familiarity with Python ML frameworks such as PyTorch or TensorFlow
- Background in linear algebra and neural network architectures
- Experience with MLOps tools and practices
- Publications or contributions to open-source NLP projects.