
Lead Machine Learning OPS Engineer
- Marlborough, MA
- $125,500 per year
- Permanent
- Full-time
- BJ’s pays weekly
- Eligible for free BJ's Inner Circle and Supplemental membership(s)*
- Generous time off programs to support busy lifestyles*
- Benefit plans for your changing needs*
- 401(k) plan with company match (must be at least 18 years old)
- Build, deploy, and manage machine learning models using applied statistics (supervised/unsupervised learning, feature selection, etc.)
- Translate complex ML concepts and findings for non-technical audiences
- Develop and implement scalable ML deployment strategies, ensuring models are production-ready
- Monitor, retrain, and fine-tune models in production to maintain optimal accuracy and performance
- Design and maintain robust ML infrastructure, including data pipelines and model serving
- Ensure infrastructure supports high-volume data processing and model training
- Implement and manage MLOps tools to support the end-to-end workflow
- Establish monitoring systems and alerts to track model performance and health
- Proactively address production issues related to model accuracy, drift, and latency
- Continuously optimize models and pipelines for cost, speed, and scalability
- Work closely with business stakeholders, data scientists, engineers, and IT teams to ensure seamless integration of ML models into production systems
- Collaborate with cross-functional teams to understand business requirements and translate them into ML solutions
- Mentor junior team members and promote best practices in model development and deployment
- Educational Background: Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related field.
- Professional Experience: 4-6 years of experience in machine learning or MLOps.
- Technical Expertise:
- Strong knowledge of ML algorithms (e.g., XGBoost, Prophet, matrix factorization, multi-armed bandits)
- Experience with model lifecycle management, including tracking, packaging, versioning, and deployment using tools such as MLflow
- Strong programming skills with Python and PySpark (preferred)
- Experience with CI/CD pipelines, automation tools, and practices
- Proficiency in monitoring tools and performance optimization techniques
- Experience with engineering and development collaboration tools such as Git, Jira, and Confluence
- Familiarity with cloud platforms (such as AWS) and Databricks (preferred)
- Problem-Solving: Strong analytical and problem-solving skills.
- Communication: Excellent verbal and written communication skills.
- Team Collaboration: Ability to work collaboratively in a cross-functional team environment.
- Adaptability: Ability to adapt to new technologies and rapidly changing environments.
- Detail-Oriented: Attention to detail and a commitment to delivering high-quality solutions.