
AI Evaluation Data Scientist
- Cupertino, CA
- Permanent
- Full-time
- BS and a minimum of 10 years of relevant industry experience in an empirical field with an emphasis on quantitative methodologies for studying human behavior, including HCI, Psychometrics, Quantitative or Experimental Psychology, Educational Measurement, Language Assessment, or a related field
- Proficiency in Python and ability to write clean, performant code and collaborate using standard software development practices (e.g. Git)
- Strong statistical analysis skills and experience in crafting experiments and validating data quality and model performance
- Experience in building and extending data and inference pipelines to process large-scale datasets
- MS or PhD or equivalent experience in relevant fields
- Real-world experience with LLM-based evaluation systems and with human annotation and human evaluation methodologies
- Experience in rigorous, evidence-based approaches to test development, e.g. quantitative and qualitative test design, reliability and validity analysis
- Customer-focused mindset with experience or strong interest in building consumer digital health and wellness products
- Strong communication skills and ability to work cross-functionally with technical and non-technical stakeholders