
Senior Data Engineer
- Newark, NJ
- $107,700-160,500 per year
- Permanent
- Full-time
- Build and optimize data pipelines, logic, and storage systems using the latest coding practices, industry standards, modern design patterns, and architectural principles; actively code and execute against the roadmap
- Develop high-quality, well-documented, and efficient code adhering to all applicable Prudential standards
- Conduct complex data analysis and report on results; prepare data for prescriptive and predictive modeling; combine raw information from different sources
- Collaborate with data analysts, scientists, and architects on data projects to improve data acquisition, transformation, and organization processes, as well as data reliability, efficiency, and quality
- Write unit and integration tests and functional automation; research problems discovered by quality assurance or product support and develop solutions to address them (a minimal testing sketch appears after this list)
- Bring an applied understanding of relevant and emerging technologies, begin to identify opportunities to provide input to the team and coach others, and embed learning and innovation in the day-to-day
- Work on complex problems in which analysis of situations or data requires an in-depth evaluation of various factors
- Use programming languages including but not limited to Python, R, SQL, Java, Scala, PySpark/Apache Spark, and shell scripting
- Bachelor's degree in Computer Science or Engineering, or equivalent experience in related fields
- Experience working with DevOps automation tools and practices; knowledge of the full software development life cycle (SDLC)
- Leverage diverse ideas, experiences, thoughts, and perspectives to the benefit of the organization
- Knowledge of the business concepts, tools, and processes needed to make sound decisions in the context of the company's business
- Ability to learn new skills and knowledge on an ongoing basis through self-initiative and tackling challenges
- Excellent problem-solving, communication, and collaboration skills; enjoy learning new skills!
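
To make the unit-testing expectation above concrete, here is a minimal pytest-style sketch for a hypothetical transformation helper; the function, field names, and rules are illustrative assumptions only, not part of the role description.

```python
# test_transform.py -- minimal pytest sketch for a hypothetical transformation helper.
# The function, field names, and rules are illustrative assumptions, not part of the posting.


def normalize_premium(record: dict) -> dict:
    """Hypothetical helper: cast the premium to a float, defaulting missing values to 0.0."""
    value = record.get("premium")
    return {**record, "premium": float(value) if value not in (None, "") else 0.0}


def test_normalize_premium_casts_strings():
    assert normalize_premium({"policy_id": "A1", "premium": "125.50"})["premium"] == 125.50


def test_normalize_premium_defaults_missing_values():
    assert normalize_premium({"policy_id": "A2", "premium": None})["premium"] == 0.0
```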
- Applied experience with several of the following:
- Programming Languages: Python, R, SQL, Java, Scala, PySpark/Apache Spark, shell scripting
- Data Ingestion, Integration & Transformation: Moving data of varying sources, formats, and volumes to analytics platforms using a range of tools; preparing data for further analysis; transforming, mapping, and wrangling raw data to generate insights (a minimal PySpark sketch appears after this list)
- Database Management System: Storing, organizing, managing, and delivering data using relational DBs, NoSQL DBs, Graph DBs, and data warehouse technologies including AWS Redshift and Snowflake
- Database tools: Data architecture to store, organize, and manage data; experience with SQL- and NoSQL-based databases for storage and processing of structured, semi-structured, and unstructured data
- Real-Time Analytics: Spark, Kinesis Data Streams
- Data Buffering: Kinesis, Kafka
- Workflow Orchestration: Airflow, AppFlow, Autosys, CloudWatch, Splunk
- Data Visualization: Tableau, Power BI, MS Excel
- Data Lakes & Warehousing: Building Data Models, Data Lakes and Data Warehousing
- Data Protection and Security: Knowledge of data protection and security principles and services; data loss prevention, role-based access controls, data encryption, data access capture, and core security services
- Common Infrastructure as Code (IaC) Frameworks: Ansible, CloudFormation
- Cloud Computing: Knowledge of fundamental AWS architectural principles and core services; strong ability with CloudFormation and writing code for the cloud
- Testing/Quality: Unit, interface, and end-user testing concepts and tooling, inclusive of non-functional requirements (performance, usability, reliability, security/vulnerability scanning, etc.), including how testing is integrated into DevOps; accessibility awareness
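
As an illustration of the ingestion-and-transformation work listed above, the PySpark sketch below reads raw CSV data, applies a simple cleanup, and writes Parquet to a curated location; the bucket paths, column names, and cleanup rules are hypothetical.

```python
# Minimal PySpark ingestion/transformation sketch; paths and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("policy_ingest_sketch").getOrCreate()

# Ingest: read raw CSV data (source path is an assumption for illustration).
raw = spark.read.option("header", True).csv("s3://example-bucket/raw/policies/")

# Transform: trim identifiers, cast premiums, and drop rows missing a policy id.
clean = (
    raw.withColumn("policy_id", F.trim(F.col("policy_id")))
       .withColumn("premium", F.col("premium").cast("double"))
       .filter(F.col("policy_id").isNotNull())
)

# Load: write Parquet for downstream analytics (destination path is hypothetical).
clean.write.mode("overwrite").parquet("s3://example-bucket/curated/policies/")
```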
- Preferred qualifications:
- Serverless data pipeline development using AWS Glue, Lambda, and Step Functions (an illustrative sketch appears below)
- Other relevant certifications are a plus
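
As a hedged illustration of the serverless pattern named in the preferred qualifications, the sketch below shows a Lambda handler that starts a Glue job via boto3; the job name, event shape, and arguments are assumptions, not the team's actual configuration.

```python
# Illustrative Lambda handler that kicks off a Glue ETL job; the job name and
# arguments are hypothetical and not part of the posting.
import boto3

glue = boto3.client("glue")


def lambda_handler(event, context):
    """Start a (hypothetical) Glue job, passing the triggering object key as an argument."""
    run = glue.start_job_run(
        JobName="example-policy-etl",  # assumed job name
        Arguments={"--source_key": event.get("source_key", "")},
    )
    return {"JobRunId": run["JobRunId"]}
```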