Modernize and enhance content delivery systems using AI-enabled solutions, with a focus on natural language processing (NLP) and retrieval-augmented generation (RAG) architectures.
Requirements
- At least 5 years of hands-on data science experience with a focus on NLP
- At least 5 years of experience using Python, with expertise in LLM application frameworks such as LangChain, LangGraph, or comparable toolkits
- Proven experience in model development and applying machine learning techniques to real-world problems
- Expertise in retrieval-based LLM workflows (RAG, VRAG, GraphRAG)
- Deep understanding of embedding models, semantic search, and vector stores (e.g., FAISS, Pinecone)
- Experience with document loaders and text-splitting (chunking) strategies
- Familiarity with MLOps practices and production-level deployment of AI pipelines
Responsibilities
- Design NLP Workflows: Develop scalable pipelines for text ingestion, cleaning, normalization, and tokenization to support downstream applications.
- Implement Indexing and Vectorization Strategies: Architect and maintain robust indexing systems and vector databases for semantic search and retrieval.
- Develop Prompting and Fine-Tuning Frameworks: Create reusable prompting strategies and lead fine-tuning initiatives for LLMs tailored to business-specific tasks.
- Build LangChain/LangGraph Applications: Construct dynamic knowledge systems and agentic workflows using LangChain and LangGraph.
- Integrate Advanced RAG Architectures: Apply VRAG and GraphRAG design patterns to enrich information retrieval and contextual understanding.
- Conduct Performance Optimization: Perform benchmark testing and model evaluations to improve accuracy, efficiency, and scalability of NLP systems.
- Collaborate Across Teams: Work closely with engineering, product, and research stakeholders to deliver integrated AI-driven features.
- Provide Technical Leadership: Mentor junior data scientists, guide best practices, and drive innovation across AI projects.
Other
- Bachelor’s degree in Computer Science, Data Science, Computational Linguistics, or a related field
- Fully remote position based in the US
- Work primarily during core business hours in your time zone, with flexibility to accommodate other global time zones as needed