
Manager II, Engineering - APM Root Cause Analysis
- New York City, NY
- Permanent
- Full-time
- A platform to ingest interesting changes from across our customer environments (Deployments, DB changes, Feature Flag changes, K8s changes, etc.)
- A system to process past incidents in our environment and label the faulty changes that led to them, enabling us to build a high-quality evaluation dataset for faulty change detection
- A system that uses LLMs, ML, and statistical models to assess whether a specific change is the cause of an incident
- A product experience to expose those faulty changes in strategic locations in the product in a way that aids incident response and reduces MTTR
- Solve challenging, ambiguous problems in automating root cause analysis through faulty change detection, using the latest agentic AI approaches as well as ML anomaly detection and statistical methods
- Evaluate and benchmark the quality and real-world performance of the automated faulty change detection model
- Lead and mentor a team of experienced software engineers, fostering their career growth while ensuring high team performance
- Drive the technical roadmap in collaboration with your team, product managers, and design teams
- An experienced software engineering leader with a track record of successfully delivering GenAI/ML products at scale
- Experienced working with high-scale distributed systems, as well as participating in and structuring on-call processes for them
- You are passionate about building products that solve real user problems, and you are adept at forming an opinion on product direction and on how we should structure our execution strategy
- You have a BS/MS/PhD in Computer Science, Engineering, or a related scientific field, or equivalent experience
- New hire stock equity (RSUs) and employee stock purchase plan (ESPP)
- Continuous professional development, product training, and career pathing
- Intradepartmental mentor and buddy program for in-house networking
- An inclusive company culture and the ability to join our Community Guilds (Datadog employee resource groups)
- Access to Inclusion Talks, our internal panel discussions
- Free, global mental health benefits for employees and dependents age 6+
- Competitive global benefits