Principal Site Reliability Engineer

Dallas, TX
Permanent
Full-time

Every day, Global Payments makes it possible for millions of people to move money between buyers and sellers using our payments solutions for credit, debit, prepaid and merchant services. Our worldwide team helps over 3 million companies, more than 1,300 financial institutions and over 600 million cardholders grow with confidence and achieve amazing results. We are driven by our passion for success and we are proud to deliver best-in-class payment technology and software solutions. Join our dynamic team and make your mark on the payments technology landscape of tomorrow.

Summary of This Role

Responsible for availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning. Creates a bridge between development and operations by applying a software engineering mindset to system administration topics. Splits time between operations/on-call duties and developing systems and software that help increase site reliability and performance.

What Part Will You Play?

Work with various teams, including DevOps, Development, and Business partners, to gather requirements for application changes that we need to account for in our build-outs.
Participate in architecture and R&D discussions for new technology or processes to increase the performance and reliability of our systems.
Chaos engineering - you’re expected to think laterally about how our systems might fail in theory, design tests to demonstrate how they behave in practice, and then formulate and implement remediation plans, as appropriate.
Pushing our systems to their limits, and then coming up with designs for how to get them to the next performance tier.
Use DevOps and GitOps practices to improve automation and processes so that self-service becomes possible.
Safeguarding reliability. Ensuring that our services are highly available, resilient against disasters, self-monitoring, and self-healing.
Running “game days” to test assumptions about reliability and learn what will break before it matters to customers.
Reviewing designs with an eye toward increasing the holistic stability of our platform and identifying potential risks.
Building systems to proactively monitor the health, performance and security of our production and non-production virtualized infrastructure.
Improving our monitoring and alerting systems to make sure engineers get paged when it matters (and don’t get paged when it doesn’t).
Troubleshooting systems and network issues, alongside our Technical Operations Team.
Mentoring other engineers in reliability-related skills.
Evolving our SDLC, practices, and tooling to account for Site Reliability considerations and best practices.
Developing runbooks and improving documentation.

What Are We Looking For in This Role?

Minimum Qualifications

BS in Computer Science, Information Technology, Business / Management Information Systems or related field
Typically a minimum of 8 years of professional experience in coding, designing, developing, and analyzing data. Typically has broad, advanced knowledge of multiple front-end and back-end languages and technologies, including but not limited to: two or more modern programming languages used in the enterprise, experience working with various APIs and external services, and experience with both relational and NoSQL databases.
Experience with public and private clouds, Jenkins, Terraform, Ansible, OpenShift, Kubernetes, or AWS EKS

Preferred Qualifications

BS in Computer Science, Information Technology, Business / Management Information Systems or related field
10+ years of professional experience in coding, designing, developing, and analyzing data, plus experience with IBM Rational tools

What Are Our Desired Skills and Capabilities?

Skills / Knowledge - Having broad expertise or unique knowledge, uses skills to contribute to development of company objectives and principles and to achieve goals in creative and effective ways. Barriers to entry such as technical committee review may exist at this level.
Job Complexity - Works on significant and unique issues where analysis of situations or data requires an evaluation of intangibles. Exercises independent judgment in methods, techniques and evaluation criteria for obtaining results. Creates formal networks involving coordination among groups.
Supervision - Acts independently to determine methods and procedures on new or special assignments. May supervise the activities of others.

Global Payments Inc. is an equal opportunity employer. Global Payments provides equal employment opportunities to all employees and applicants for employment without regard to race, color, religion, sex (including pregnancy), national origin, ancestry, age, marital status, sexual orientation, gender identity or expression, disability, veteran status, genetic information or any other basis protected by law. If you wish to request reasonable accommodations related to applying for employment or provide feedback about the accessibility of this website, please contact .
