Design, build, and operationalize intelligent data pipelines that power Large Language Model (LLM) and Generative AI systems. This role bridges data engineering and AI innovation, enabling secure, scalable, and high-performance integration of enterprise data with advanced language models.
Requirements
- Proven experience as a Data Engineer or ML Engineer integrating LLM or Generative AI systems.
- Proficiency in Python, SQL, and distributed data frameworks such as Spark or Databricks.
- Strong understanding of RAG architectures and vector databases (e.g., Pinecone, Weaviate, Chroma, FAISS).
- Experience with orchestration frameworks such as LangChain, LlamaIndex, or Semantic Kernel.
- Understanding of AI security, data privacy, and prompt injection defenses.
- Experience working with Azure Databricks, Azure AI Services, or Azure OpenAI.
- Experience fine-tuning or customizing LLMs for enterprise use cases.
Responsibilities
- Design and optimize data pipelines serving LLM and Generative AI applications.
- Integrate Generative AI systems (e.g., OpenAI, Azure OpenAI, Anthropic, LLaMA, Mistral) with curated enterprise data sources.
- Develop and maintain retrieval-augmented generation (RAG) pipelines connecting structured and unstructured data to AI model contexts.
- Implement agentic system architectures using frameworks like LangChain, Semantic Kernel, or LlamaIndex.
- Apply best practices in AI security, data governance, and compliance to ensure responsible AI development.
- Automate LLM evaluation, fine-tuning, and deployment workflows while maintaining high system availability and accuracy.
- Collaborate with data scientists, ML engineers, and AI researchers to ensure data readiness and model efficiency.
Other
- Position Type: Contract (6 Months, likely to extend or convert).
- Work Location: Onsite in Columbus, OH (must be local or willing to relocate by Day 1).
- Work Authorization: Must be authorized to work in the United States without sponsorship.
- Strong collaboration, problem-solving, and communication skills.
- Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience).