10850 – Sr. Platform Engineer (Hadoop Admin)

Hyundai

  • Fountain Valley, CA
  • $103,170-158,873 per year
  • Permanent
  • Full-time
  • 1 month ago
Purpose: Hyundai AutoEver America is seeking a highly experienced Senior or Lead Platform Engineer/Site Reliability Engineer (SRE)/Hadoop Admin to manage and enhance our petabyte-scale, on-premises data platform, which is built on the open-source Hadoop ecosystem. The ideal candidate brings deep technical expertise, a strong understanding of distributed systems, and extensive experience operating and optimizing large-scale data infrastructure. This role requires a hands-on technical leader who can drive platform innovation, ensure high availability and reliability, and mentor team members in best practices for performance, automation, and resiliency.

Essential Functions:
  • Own and operate the end-to-end infrastructure of a large-scale, on-prem Hadoop-based data platform, ensuring high availability and reliability.
  • Design, implement, and maintain core platform components, including Hadoop, Hive, Spark, NiFi, Iceberg, ELK, OpenSearch and Ambari.
  • Automate infrastructure management, monitoring, and deployments using CI/CD pipelines (GitLab) and scripting.
  • Implement and enforce security controls, access management, and compliance standards.
  • Perform system upgrades, patching, performance tuning, and troubleshooting across platform components.
  • Optimize observability and telemetry using tools like Prometheus, Grafana, and OpenTelemetry for real-time performance monitoring and alerting.
  • Proactively monitor system health, resolve incidents, and conduct root-cause analyses to prevent recurrence.
  • Collaborate with data engineering, analytics, and infrastructure teams to align platform capabilities with evolving needs.
  • Lead technical discussions, mentor junior engineers, and advocate for DevSecOps and SRE best practices.
  • Champion a culture of operational excellence by continuously improving reliability, automation, and performance.
Please note that this job description is not designed to cover or contain a comprehensive listing of the activities, duties, or responsibilities required of the employee for this job. Duties, responsibilities, and activities may change at any time, with or without notice.

Basic Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or a related field.
  • 10+ years of experience in Platform Engineering, Site Reliability Engineering, or similar roles, with proven success managing large-scale, distributed Hadoop infrastructure.
  • Deep expertise in the Hadoop ecosystem, including HDFS, YARN, Hive, Spark, NiFi, Ambari, and Iceberg.
  • Strong Linux system administration skills (CentOS/Rocky preferred), including system tuning, performance optimization, and troubleshooting.
  • Proficiency in containerization and orchestration using Docker and Kubernetes.
  • Solid experience with automation and Infrastructure as Code, leveraging tools like GitLab CI/CD and scripting in Python and Bash.
  • Practical knowledge of monitoring and observability tools (e.g., Prometheus, Grafana, OpenTelemetry) and understanding of system health, alerting, and telemetry.
  • Familiarity with networking concepts, security protocols, and data compliance requirements.
  • Experience managing petabyte-scale data platforms and implementing disaster recovery strategies.
  • Understanding of data governance, metadata management, and operational best practices.
  • Demonstrated ability to lead technical projects, mentor engineers, and collaborate effectively with cross-functional teams.
  • Excellent problem-solving, communication, and leadership skills.
Certification: Relevant certifications (e.g., Cloudera/Hortonworks) are a plus.

Salary Range: $103,170 - $158,873