
Site Reliability Engineer – Sr. Consultant
- Austin, TX
- $143,200-207,800 per year
- Permanent
- Full-time
- Ensure the security and safety of application services and platforms. Spearhead the enhancement of operational practices focusing on efficiency, security, and excellence.
- Maintain zero downtime by swiftly addressing any issues to ensure environments are always operational. Conduct rapid root cause analysis and implement remediation in production environments after thorough testing.
- Oversee all activities within the environment, including deploying new code.
- Foster an inclusive, innovative, and collaborative team culture.
- Build strong partnerships with key stakeholders, including product management, engineering, design, and operations.
- Communicate effectively with both technical and business partners to create frameworks for discussing complex topics.
- Regularly analyze the environment and promote the adoption of automation and Generative AI to stay competitive.
- Lead cloud infrastructure and GenAI adoption and migration, ensuring a seamless transition with minimal downtime.
- Run problem bridges by collaborating with different functional and technical teams, escalating issues as needed for timely resolution.
- Proactively share important context and information with relevant stakeholders.
- 8 or more years of relevant work experience with a Bachelor's Degree, or at least 5 years of experience with an Advanced Degree (e.g. Master's, MBA, JD, MD), or 2 years of work experience with a PhD
- 9 or more years of relevant work experience with a Bachelor's Degree, or 7 or more years of relevant experience with an Advanced Degree (e.g. Master's, MBA, JD, MD), or 3 or more years of experience with a PhD
- Engineering degree in IT or Computer Science
- 8+ years of IT experience with expertise in DevOps, build and release engineering, cloud infrastructure and automation, and technical support.
- 8+ years of experience with Java and J2EE applications, and a deep understanding of web services technologies (REST and SOAP).
- 4+ years of experience managing and troubleshooting applications on Containers (Docker) and Cloud (AWS, GCP, Azure).
- Knowledge of Generative AI capabilities and use cases to enable such capabilities in the environment.
- Ability to work as a team player
- Good written and verbal communication skills
- Punctual and reliable in office attendance and work
- Great with problem solving and troubleshooting
- Ability to effectively prioritize and coordinate
- Ability to learn fast and implement the latest technology trends in the industry, particularly GenAI
- Good understanding of CI/CD technologies.
- Core skills in Docker, DevOps, and Linux.
- DevOps experience with Jenkins, Ansible, Docker, Kubernetes.
- Good experience with Java-based web applications.
- Experience implementing CI/CD processes.
- Expertise in troubleshooting Java applications running on Tomcat and web applications running on Apache.
- Good exposure to virtualization and containers (Docker).
- Ability to build deployment scripts, build scripts, and automated solutions using scripting languages such as Shell (Bash), JavaScript, Python, or others.
- Experience working with Docker, including creating containers and images and writing Dockerfiles.
- Experience creating deployments, services, and ingress flows for application setup in a Kubernetes cluster.
- Experience participating in release-level discussions and working through the full SDLC under Agile methodology.
- Provide on-call support for all DevOps activities.
- Able to lead SWAT bridges as the technical expert.
- Exceptional analytical and problem-solving skills, along with strong oral and written communication abilities. Proven proficiency in troubleshooting, root-cause analysis, application design, and implementing major components for large projects.
- Experience building tools to automate production support activities, enhancing the efficiency and productivity of service desk and operations groups.
- Prior experience working in 24x7 environments.