Site Reliability Engineer

Stellent IT

  • Wilmington, DE
  • Permanent
  • Full-time
  • 1 day ago
Job Title: Site Reliability Engineer
Location: Wilmington, DE (Onsite)
Long Term Contract
Interview: Phone, Video and/or In-person Interview
Key Responsibilities:
  • Lead and conduct detailed Root Cause Analysis (RCA) for incidents, identifying underlying issues and recommending corrective actions.
  • Document and communicate findings from RCA processes, ensuring transparency and knowledge sharing across the organization.
  • Develop and maintain incident postmortem reports, providing insights and actionable recommendations to stakeholders.
  • Monitor system performance and reliability metrics, proactively identifying potential issues before they escalate.
  • Contribute to the design and implementation of automated monitoring and alerting systems to improve incident detection and response times.
  • Continuously improve the incident management process, incorporating feedback and lessons learned from RCA activities.
  • Participate in incident response activities.
Qualifications:
  • Bachelor's degree or equivalent experience in a software engineering discipline
  • 6+ years of Software Engineering experience
  • Excellent communication skills, with the ability to convey technical findings to both technical and non-technical audiences
  • Excellent debugging and troubleshooting skills
  • Experience in Site Reliability Engineering, DevOps, or a similar role, with a focus on incident management and RCA.
  • Experience with monitoring and alerting tools (e.g., Prometheus, Grafana, Dynatrace).
  • Familiarity with containerization technologies (e.g., Docker, Kubernetes).
Sandip Kumar
Sr. Tech Recruiter
Email:
Address:
505 Knolle Court
Saint Augustine, FL 32092
Telephone:
+1 321-641-0093

Stellent IT