
Senior Software Engineer-SRE
- Mountain View, CA
- Permanent
- Full-time
- Drive operational excellence for the connected services the business offers its customers, delivering an 'always on' operation year-round at the right cost.
- Adopt observability best practices, including distributed tracing, to reduce mean time to detect (MTTD) and mean time to resolve (MTTR).
- Create or enhance monitoring capabilities using AI-assisted tools to increase alert accuracy, detect issues, and resolve them automatically.
- Explore the products offered to customers to gain a deep understanding of them, and influence the engineering culture toward developing observable applications.
- Create runbooks covering the standard operating procedures for every production change.
- Develop failure mode and effects analysis (FMEA) and chaos engineering best practices, backed by automation.
- Invest in self-service capabilities to drive efficiency, with a focus on reducing friction and manual steps.
- Participate in the on-call rotation: respond to incoming alerts, triage them, and take the necessary steps to minimize impact.
- Contribute to infrastructure updates, such as compute, storage, network, and content changes.