Implement AI data pipelines that bring together structured, semi-structured, and unstructured data to support AI and agentic solutions.
Requirements
8+ years of data engineering experience, including data solutions, SQL and NoSQL databases, Snowflake, ETL/ELT tools, CI/CD, big data, cloud technologies (AWS, Google Cloud, or Azure), Python/Spark, and data mesh, data lake, or data fabric architectures.
3+ years of AI/ML experience, with 1+ years of data engineering experience focused on supporting Generative AI technologies.
Strong hands-on experience implementing production-ready, enterprise-grade AI data solutions.
Experience with prompt engineering techniques for large language models.
Experience in implementing Retrieval-Augmented Generation (RAG) pipelines, integrating retrieval mechanisms with language models.
Experience with vector databases and graph databases, including implementation and optimization.
Proficiency in implementing scalable, AI-driven data systems supporting agentic solutions (AWS Lambda, S3, EC2, LangChain, LangGraph).
Responsibilities
Develop AI-driven systems to improve data capabilities, ensuring compliance with industry best practices.
Implement efficient Retrieval-Augmented Generation (RAG) architectures and integrate them with enterprise data infrastructure.
Design, build, and maintain scalable and robust real-time data streaming pipelines using technologies such as Apache Kafka, AWS Kinesis, Spark Streaming, or similar.
Develop data domains and data products for various consumption archetypes, including reporting, data science, AI/ML, and analytics.
Ensure the reliability, availability, and scalability of data pipelines and systems through effective monitoring, alerting, and incident management.
Implement best practices in reliability engineering, including redundancy, fault tolerance, and disaster recovery strategies.
Develop graph database solutions for complex data relationships supporting AI systems.
Collaborate with cross-functional teams to integrate solutions into operational processes and systems supporting various functions.
Mentor junior team members and engage in communities of practice to deliver high-quality data and AI solutions while promoting best practices, standards, and adoption of reusable patterns.
Partner with architects and stakeholders to influence and implement the vision of the AI and data pipelines while safeguarding the integrity and scalability of the environment.
Other
This role will have a hybrid work schedule, with the expectation of working in an office (Columbus, OH; Chicago, IL; Hartford, CT; or Charlotte, NC) 3 days a week (Tuesday through Thursday).
Candidate must be authorized to work in the US without company sponsorship. The company will not support the STEM OPT I-983 Training Plan endorsement for this position.