
AI Data Engineer - Hybrid
- Hartford, CT
- $100,960-151,440 per year
- Permanent
- Full-time
- Design, develop, and implement complex data pipelines for AI/ML, including those supporting RAG architectures, using technologies such as Python, Snowflake, AWS, GCP, and Vertex AI.
- Build and maintain data pipelines that ingest, transform, and load data from various sources (structured, unstructured, and semi-structured) into data warehouses, data lakes, vector databases (e.g., Pinecone, Weaviate, Faiss), and graph databases (e.g., Neo4j, Amazon Neptune).
- Develop and implement data quality checks, validation processes, and monitoring solutions to ensure data accuracy, consistency, and reliability.
- Implement end-to-end generative AI data pipelines, from data ingestion to pipeline deployment and monitoring.
- Develop complex AI systems, adhering to best practices in software engineering and AI development.
- Work with cross-functional teams to integrate AI solutions into existing products and services.
- Keep up-to-date with AI advancements and apply new technologies and methodologies to our systems.
- Assist in mentoring junior AI/data engineers in AI development best practices.
- Implement and optimize RAG architectures and pipelines.
- Develop solutions for handling unstructured data in AI pipelines.
- Implement agentic workflows for autonomous AI systems.
- Develop graph database solutions for complex data relationships in AI systems.
- Integrate AI pipelines with Snowflake data warehouse for efficient data processing and storage.
- Apply GenAI solutions to insurance-specific use cases and challenges.
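To make the RAG pipeline responsibilities above concrete, here is a minimal, illustrative sketch of the ingest-embed-store-retrieve loop. This is a toy, not production code: the hashed bag-of-words embedding stands in for a real embedding model (e.g., one served from Vertex AI), and the in-memory store stands in for a vector database such as Pinecone or Weaviate. All names and sample documents are assumptions for illustration.

```python
import hashlib
import math

DIM = 64  # toy embedding dimension; real embedding models use hundreds+


def embed(text: str) -> list[float]:
    """Toy embedding: hash each token into a fixed-size bag-of-words vector,
    then L2-normalize. Stands in for a real embedding model."""
    vec = [0.0] * DIM
    for raw in text.lower().split():
        token = raw.strip(".,?!")
        if not token:
            continue
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class ToyVectorStore:
    """In-memory stand-in for a vector database (Pinecone, Weaviate, etc.)."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, list[float]]] = []

    def upsert(self, chunk: str) -> None:
        self._rows.append((chunk, embed(chunk)))

    def query(self, question: str, top_k: int = 1) -> list[str]:
        q = embed(question)
        # Vectors are unit-length, so the dot product is cosine similarity.
        scored = sorted(
            self._rows,
            key=lambda row: -sum(a * b for a, b in zip(q, row[1])),
        )
        return [chunk for chunk, _ in scored[:top_k]]


# Ingestion: load (hypothetical) source documents into the store.
store = ToyVectorStore()
for doc in [
    "Claims data is refreshed nightly from the policy system.",
    "Vector indexes are rebuilt weekly after embedding updates.",
]:
    store.upsert(doc)

# Retrieval: fetch the most relevant chunk to ground an LLM prompt.
context = store.query("How often is claims data refreshed?")[0]
print(context)  # → "Claims data is refreshed nightly from the policy system."
```

In a real pipeline the retrieved chunk would be injected into the prompt sent to a language model; here it simply demonstrates the retrieval step.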
- Candidates must be authorized to work in the US without company sponsorship. The company will not support the STEM OPT I-983 Training Plan endorsement for this position.
- Bachelor's degree in Computer Science, Artificial Intelligence, or a related field.
- 2+ years of experience in data engineering.
- Experience with ETL tools (Informatica, IDMC, Talend, etc.) and awareness of the big data tech stack: Hadoop, EMR, and PySpark.
- Advanced knowledge of SQL for data and analytics on any relational database (Oracle, SQL Server, Snowflake, etc.).
- Strong data engineering foundation, with at least some hands-on experience with generative AI technologies.
- Ability to showcase implementation of production-ready enterprise-grade GenAI pipelines.
- Experience with and awareness of prompt engineering techniques for large language models.
- Experience implementing Retrieval-Augmented Generation (RAG) pipelines, integrating retrieval mechanisms with language models.
- Knowledge of vector databases and graph databases, including implementation and optimization.
- Experience processing and leveraging unstructured data for GenAI applications.
- Proficiency in implementing agentic workflows for AI systems.
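For context on the agentic-workflow requirement, a minimal plan-act-observe loop looks roughly like the sketch below. The `fake_llm` planner is a stand-in for a real LLM call, and the tool name, argument, and insurance example are illustrative assumptions, not this employer's actual stack.

```python
from typing import Callable

# Tool registry: the actions an agent may take. Names are illustrative.
TOOLS: dict[str, Callable[[str], str]] = {
    "lookup_policy": lambda policy_id: (
        f"Policy {policy_id}: auto, active, $500 deductible"
    ),
}


def fake_llm(question: str, observations: list[str]) -> tuple[str, str, str]:
    """Stand-in for an LLM planner that decides the next step.

    Returns ("tool", tool_name, tool_arg) or ("final", answer, "").
    A real agent would prompt a model and parse its structured output;
    this canned decision exists only to make the loop runnable.
    """
    if not observations:
        return ("tool", "lookup_policy", "A-123")
    return ("final", f"Grounded answer: {observations[-1]}", "")


def run_agent(question: str, max_steps: int = 5) -> str:
    """Plan-act-observe loop: ask the planner, run tools, feed results back."""
    observations: list[str] = []
    for _ in range(max_steps):
        kind, payload, arg = fake_llm(question, observations)
        if kind == "final":
            return payload
        observations.append(TOOLS[payload](arg))  # act, then observe
    return "Stopped after max_steps without a final answer"


print(run_agent("What is the deductible on policy A-123?"))
```

The `max_steps` cap is the one production-relevant detail: autonomous loops need a hard bound so a confused planner cannot call tools forever.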