AWS Data Engineer (PySpark/SageMaker)
Innovim Technology Solutions
- New York City, NY
- Permanent
- Full-time

Responsibilities:
- Design, develop, and deploy PySpark ETL pipelines to migrate and transform actuarial data (a minimal sketch follows this list).
- Build data ingestion pipelines from multiple source systems to Redshift using AWS Glue, DMS, and Step Functions.
- Optimize performance using Redshift Spectrum for external table queries.
- Automate data workflows using AWS Lambda, Step Functions, and the Stonebranch job scheduler.
- Implement and maintain CI/CD pipelines for deploying data applications and monitoring pipeline health.
- Collaborate closely with actuarial teams, data analysts, and other engineers in an Agile environment.
- Participate in sprint planning, story refinement, and backlog grooming using JIRA.
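
As a flavor of the day-to-day work, here is a minimal, hypothetical PySpark ETL sketch in the spirit of the responsibilities above: read raw actuarial extracts from S3, apply light transformations, and load the result into Redshift. Every bucket name, table, and connection detail is an illustrative assumption, not a project specific.

```python
# Minimal, hypothetical PySpark ETL sketch: S3 -> transform -> Redshift.
# All bucket names, schemas, and credentials below are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("actuarial-etl-sketch").getOrCreate()

# Extract: read raw actuarial extracts from S3 (path is an assumption).
raw = spark.read.parquet("s3://example-raw-bucket/actuarial/policies/")

# Transform: light cleanup and a derived load-date column, purely illustrative.
clean = (
    raw.dropDuplicates(["policy_id"])
       .withColumn("load_date", F.current_date())
       .filter(F.col("premium_amount") > 0)
)

# Load: write to Redshift over JDBC. In a real Glue job, credentials would
# come from Secrets Manager or a Glue connection, not inline options, and
# writes would typically stage through S3 with COPY for volume.
(clean.write
      .format("jdbc")
      .option("url", "jdbc:redshift://example-cluster:5439/dev")
      .option("dbtable", "staging.policies")
      .option("user", "etl_user")
      .option("password", "example-only")
      .option("driver", "com.amazon.redshift.jdbc42.Driver")
      .mode("append")
      .save())

spark.stop()
```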

Qualifications:
- 5+ years of experience building ETL/ELT pipelines using PySpark.
- Proven experience with AWS Redshift and Redshift Spectrum (see the Spectrum sketch at the end of this posting).
- Strong SQL skills for data extraction, transformation, validation, and performance tuning.
- Hands-on experience with:
  - AWS Glue for data cataloging and ETL orchestration.
  - AWS Database Migration Service (DMS) for legacy-to-cloud migration.
  - AWS Step Functions and Lambda for orchestrating data workflows (see the workflow sketch at the end of this posting).
- Proficiency with CI/CD pipelines, especially tools such as Jenkins and GitLab CI.
- Experience with Stonebranch for job scheduling and monitoring.
- Understanding of data governance, quality frameworks, and security best practices.
- Experience with AWS SageMaker or similar ML platforms for model deployment or integration.
- Experience with AWS Glue DataBrew or similar data wrangling tools.
- Previous work experience in actuarial, insurance, or financial services domains.
- Working knowledge of Agile/Scrum methodology and familiarity with JIRA or similar tools.
- Excellent communication skills with the ability to explain complex technical concepts to non-technical stakeholders.
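
For the Redshift Spectrum qualification above, the usual pattern is to register an external schema over the Glue Data Catalog so queries can join S3-resident data with local tables. Below is a minimal sketch using the boto3 Redshift Data API; the cluster identifier, database, IAM role, and table names are all assumptions for illustration.

```python
# Hypothetical Redshift Spectrum setup via the boto3 Redshift Data API.
# Cluster identifier, database, IAM role ARN, and table names are placeholders.
import boto3

client = boto3.client("redshift-data", region_name="us-east-1")

# External schema backed by the Glue Data Catalog: Spectrum reads the
# Parquet files in S3 directly, so cold data never has to be loaded.
create_schema_sql = """
CREATE EXTERNAL SCHEMA IF NOT EXISTS spectrum
FROM DATA CATALOG
DATABASE 'actuarial_catalog'
IAM_ROLE 'arn:aws:iam::123456789012:role/example-spectrum-role';
"""

# Join S3-resident claims history against a local dimension table.
query_sql = """
SELECT d.line_of_business, SUM(h.claim_amount) AS total_claims
FROM spectrum.claims_history h
JOIN public.dim_policy d ON d.policy_id = h.policy_id
GROUP BY d.line_of_business;
"""

for sql in (create_schema_sql, query_sql):
    resp = client.execute_statement(
        ClusterIdentifier="example-cluster",
        Database="dev",
        DbUser="etl_user",
        Sql=sql,
    )
    print("submitted statement:", resp["Id"])
```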
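
For the workflow-automation bullets (Lambda plus Step Functions), a common shape is a small Lambda handler that starts a state machine when new data lands in S3. The sketch below is hypothetical; the state machine ARN and event wiring are assumptions.

```python
# Hypothetical Lambda handler that starts a Step Functions execution
# when a new object lands in S3. The state machine ARN is a placeholder.
import json
import boto3

sfn = boto3.client("stepfunctions")

STATE_MACHINE_ARN = (
    "arn:aws:states:us-east-1:123456789012:stateMachine:example-etl-flow"
)

def handler(event, context):
    # Pull the bucket and object key out of the S3 put-event notification.
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = record["s3"]["object"]["key"]

    # Start the ETL state machine with the new object as its input.
    resp = sfn.start_execution(
        stateMachineArn=STATE_MACHINE_ARN,
        input=json.dumps({"bucket": bucket, "key": key}),
    )
    return {"executionArn": resp["executionArn"]}
```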