Developing the next generation of agentic AI systems, including autonomous exception resolution, anomaly detection, and explainable insights, for the client's expanding R&D and Applied AI team
Requirements
- 3 to 5+ years of experience building and deploying ML systems
- Proficiency in Python and its ML libraries: PyTorch, TensorFlow, Scikit-Learn, Hugging Face Transformers
- 2+ years of hands-on experience with LLMs / SLMs: fine-tuning, prompt engineering, inference optimization
- Experience with at least two of the following: OpenAI GPT, Anthropic Claude, Google Gemini, Meta LLaMA
- Experience with vector databases, embeddings, and RAG pipelines
- Skilled in working with structured and unstructured data, SQL, and distributed frameworks (Spark, Ray)
- Solid understanding of the full ML lifecycle
Responsibilities
- Design, train, fine-tune, and deploy ML/LLM models for production
- Build RAG pipelines using vector databases
- Prototype and optimize multi-agent workflows using frameworks and protocols such as LangChain, LangGraph, and the Model Context Protocol (MCP)
- Develop prompt engineering strategies, along with optimization and safety techniques, for agentic LLM interactions
- Integrate memory, evidence packs, and explainability modules into agentic pipelines
- Partner with Data Engineering to build and maintain real-time and batch data pipelines for ML/LLM workloads
- Implement model monitoring, drift detection, and retraining pipelines
Other
- Bachelor’s or Master’s degree in Computer Science, Data Science, Machine Learning, or a related field
- Must be based in the Southeast (preferably Metro Atlanta), with occasional trips to the Atlanta office
- Not open to third-party candidates; visa sponsorship and visa transfers are not available
- Must have strong communication and collaboration skills to work cross-functionally with R&D, Data Science, Product, and Engineering teams
- Must be willing to participate in design reviews, architecture discussions, and model evaluations