
Lead Software Engineer - Middleware Reliability Engineering
- Foster City, CA
- Permanent
- Full-time
- Drive Resiliency and Availability: Partner with Platform Engineering and Product teams to enhance middleware reliability across Visa's network, implementing operational best practices and integrating quality measures throughout the product development lifecycle.
- Champion Automation: Design and develop robust automation solutions using Python, Java, and Go to streamline deployment, monitoring, and incident response processes for our middleware infrastructure.
- Infrastructure as Code: Leverage your expertise in Terraform and Ansible to manage and provision infrastructure components, ensuring consistency and repeatability across our environments.
- CI/CD Optimization: Enhance our continuous integration and continuous delivery (CI/CD) pipelines using Jenkins and Git to accelerate software delivery and improve code quality.
- Observability Enhancement: Integrate middleware products with Prometheus, Grafana, ELK, and in-house monitoring tools to provide comprehensive observability into system health, performance, and potential issues.
- Drive Innovation: Lead our evolution toward cloud-native solutions and modern DevOps and Observability practices.
- AI Integration: Develop integrations with AI/ML frameworks, chatbots, and agents to enhance automation and operational intelligence.
- Lead Technical Growth: Mentor team members on software development and promote DevOps best practices across the organization.
- Meaningful Impact: Your automation will help process billions of transactions globally
- Technical Growth: Work with cutting-edge technologies and shape architectural decisions
- Professional Development: Regular learning opportunities and conference attendance
- Work-Life Integration: Flexible hybrid schedule (2-3 days in office)
- Inclusive Environment: Join a team that values diverse perspectives and collaborative problem-solving
- Experience with HashiCorp Vault and secret management
- Understanding of security best practices and PCI and SOC compliance requirements
- Experience implementing and troubleshooting mutual TLS (mTLS) authentication for secure service-to-service communication in distributed systems (required)
- Knowledge of identity and access management principles
- You're passionate about automation and infrastructure as code
- You enjoy mentoring and knowledge sharing
- You approach problems systematically and thoughtfully
- You value collaboration and clear communication
- You're eager to learn and adapt to new technologies
- Embraces diverse perspectives and innovative solutions
- Promotes knowledge sharing and continuous learning
- Values work-life balance and sustainable practices
- Encourages experimentation and creative problem-solving
- Supports career growth and professional development
- Set the Middleware Reliability Engineering vision and strategy at Visa
- Mentor and grow technical talent
- Drive adoption of modern DevOps practices
- Influence architectural decisions
- Build systems that process billions in transactions
- Pioneer AI/ML integration in operational tooling
- 10+ years of relevant work experience with a Bachelor's degree, or at least 7 years of work experience with an advanced degree (e.g., Master's, MBA, JD, MD), or 4 years of work experience with a PhD, or 13+ years of relevant work experience.
- 12 or more years of work experience with a Bachelor's degree, or 8-10 years of experience with an advanced degree (e.g., Master's, MBA, JD, MD), or 6+ years of work experience with a PhD.
- 10+ years of proven experience in Software Development or DevOps roles, with a strong focus on middleware technologies and infrastructure automation.
- Proven track record of automating complex tasks and processes to improve efficiency and reliability using Python, Go, Java, or similar.
- Solid understanding of Linux/Unix systems, networking protocols, certificate management, secret management, system design, cloud platforms (AWS, Azure, GCP), and containerization (Kubernetes, Docker).
- Proficiency with monitoring tools (Prometheus, Grafana, Datadog, etc.), logging systems (ELK stack, Splunk), and tracing tools (Jaeger, Zipkin).
- Hands-on experience with CI/CD pipelines and Jenkins.
- Proficiency in infrastructure-as-code tools such as Terraform and Ansible.
- Familiarity with AI/ML frameworks and chatbot/agent integrations for operational automation
- Understanding of middleware technologies (Message Queues, Service Bus, API Gateways, Caching servers)
- Experience with application servers (Tomcat, nginx, JBoss, WebSphere)
- Proficiency in troubleshooting and performance optimization of distributed systems at multiple layers.
- Bachelor's or Master's degree in Computer Science or related field, or equivalent experience
- We value hands-on experience and continuous learning over specific degrees