The company is looking to implement and operationalize Generative AI (GenAI) solutions, specifically focusing on Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) applications in a production environment. This involves ensuring the reliability, scalability, and safety of these AI workloads.
Requirements
- 10+ years of experience designing and developing scalable AI solution architectures
- Proven hands-on experience in GenAI Ops: operationalizing LLM and RAG applications in production
- Strong hands-on experience with the LangChain framework
- Experience with the OpenAI API, including chat completions, embeddings, etc.
- Solid working knowledge of TensorRT and vLLM implementations
- Strong proficiency in Python and data science libraries (NumPy, Pandas, scikit-learn, PyTorch/TensorFlow)
- Proven experience applying guardrails and observability to LLM or RAG-powered applications.
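As a toy illustration of the kind of guardrail check referenced above, the sketch below implements a minimal PII-leakage detector over model output using standard-library regexes. The patterns, function name, and pass/fail shape are illustrative assumptions, not a production guardrail (a real deployment would use a vetted PII-detection library or service).

```python
import re

# Illustrative regex patterns for a few common PII types; these toy
# patterns are assumptions for the sketch, not production-grade detectors.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def check_pii_leakage(text: str) -> dict:
    """Return a guardrail verdict: which PII types appear in the text."""
    hits = {name: pat.findall(text) for name, pat in PII_PATTERNS.items()}
    leaked = {name: found for name, found in hits.items() if found}
    return {"passed": not leaked, "leaked": leaked}

# Example: a model response that accidentally echoes an email address.
verdict = check_pii_leakage("Contact the user at jane.doe@example.com.")
```

In practice a check like this would run as one step in a chain of guardrails (grounding, prompt-injection, tone), with failed verdicts blocking or redacting the response before it reaches the user.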
Responsibilities
- Implement guardrails and observability across RAG and LLM applications
- Set up GenAI Ops workflows to continuously monitor inference latency, throughput, quality, and safety metrics
- Define, track, and analyze RAG guardrail metrics using LLMs-as-judges and SLMs (e.g., attribution, grounding, prompt injection, tone, PII leakage)
- Implement annotation, structured feedback loops, fine-tuning, and alignment methods to calibrate judge models
- Use LangChain to orchestrate guardrail checks, manage prompt versioning, and integrate judge-model scoring workflows
- Work with OpenShift to deploy, scale, and monitor containerized GenAI services
- Build observability dashboards and alerts (Grafana or equivalent) for AI reliability
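To make the monitoring responsibilities above concrete, here is a minimal, self-contained sketch of accumulating per-request inference latency and judge-model quality scores and flagging threshold breaches for alerting. The thresholds, class name, and metric choices are illustrative assumptions; a real setup would export these metrics to Prometheus/Grafana rather than compute alerts in-process.

```python
import statistics
from dataclasses import dataclass, field

@dataclass
class GenAIMetrics:
    """Accumulates per-request latency and judge-model quality scores."""
    latency_threshold_s: float = 2.0   # illustrative p95 latency SLO, not from the posting
    min_quality: float = 0.7           # illustrative judge-score floor
    latencies: list = field(default_factory=list)
    quality_scores: list = field(default_factory=list)

    def record(self, latency_s: float, quality: float) -> None:
        self.latencies.append(latency_s)
        self.quality_scores.append(quality)

    def alerts(self) -> list:
        """Return alert strings for any breached thresholds (p95 latency, mean quality)."""
        out = []
        p95 = statistics.quantiles(self.latencies, n=20)[-1]  # approximate p95
        if p95 > self.latency_threshold_s:
            out.append(f"p95 latency {p95:.2f}s exceeds {self.latency_threshold_s}s")
        mean_q = statistics.fmean(self.quality_scores)
        if mean_q < self.min_quality:
            out.append(f"mean quality {mean_q:.2f} below {self.min_quality}")
        return out

# Example: one slow request pushes tail latency past the illustrative SLO.
m = GenAIMetrics()
for lat, q in [(0.4, 0.9), (0.6, 0.85), (3.5, 0.5), (0.5, 0.95)]:
    m.record(lat, q)
```

Tail-percentile latency (rather than the mean) is the usual choice here because a single slow request can be invisible in an average while still violating a user-facing SLO.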
Other
- Bachelor's or Master's degree in Data Science, Computer Science, MIS, or a related field, or equivalent experience
- Duration: 12 months