Lead the development and deployment of cutting-edge generative AI solutions on AWS.
Requirements
- Must have proven experience in building and deploying generative AI models (e.g., LLMs, GANs, diffusion models).
- Must have deep expertise in AWS AI/ML stack: SageMaker, Bedrock, EC2, Lambda, S3, CloudWatch, IAM, etc.
- Must have strong programming skills in Python (TensorFlow, PyTorch, Transformers, LangChain, etc.).
- Must have experience with MLOps tools (e.g., SageMaker Pipelines, MLflow, Kubeflow).
- Must be familiar with Retrieval-Augmented Generation (RAG) using AWS-native services and frameworks.
- Must have a strong understanding of AWS architecture, networking, and security best practices.
- Must have experience in creating a RAG system for document search using Amazon Kendra and LangChain.
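To make the RAG requirement above concrete, here is a minimal sketch of the kind of retrieval-and-prompt-assembly code the role involves, using the boto3 Kendra `retrieve` API directly. The index ID, prompt template, and `top_k` value are illustrative assumptions, not part of this posting; in a LangChain-based stack the retrieval step would typically be wrapped in a retriever class and chained to an LLM.

```python
def build_rag_prompt(question: str, passages: list[str]) -> str:
    """Assemble a grounded prompt from retrieved passages (pure helper)."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

def retrieve_passages(index_id: str, question: str, top_k: int = 3) -> list[str]:
    """Fetch the top passages from an Amazon Kendra index.

    Requires AWS credentials and a provisioned Kendra index; the
    index_id is a placeholder supplied by the caller.
    """
    import boto3  # imported lazily so build_rag_prompt stays testable offline

    kendra = boto3.client("kendra")
    resp = kendra.retrieve(IndexId=index_id, QueryText=question, PageSize=top_k)
    return [item["Content"] for item in resp["ResultItems"]]
```

The retrieved passages would then be passed to a generation endpoint (e.g., a Bedrock or SageMaker-hosted model) via `build_rag_prompt`.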
Responsibilities
- Design, develop, and deploy generative AI models (e.g., LLMs, diffusion models) using AWS services.
- Architect scalable, secure, and cost-effective ML infrastructure on AWS (e.g., SageMaker, Lambda, ECS, EKS).
- Fine-tune open foundation models (e.g., LLaMA, Falcon, Mistral) using Amazon SageMaker or custom pipelines.
- Implement real-time inference pipelines using AWS services such as SageMaker Endpoints, Lambda, and API Gateway.
- Evaluate and integrate managed and third-party model APIs (e.g., Amazon Bedrock, OpenAI, Anthropic) into custom workflows.
- Optimize model training and inference performance using GPU-based instances (e.g., EC2 P4d, G5) and distributed training.
- Ensure compliance with security, privacy, and governance policies when deploying models in AWS environments.
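As a sketch of the real-time inference responsibility above: a hedged example of serializing a generation request and calling a deployed SageMaker real-time endpoint. The endpoint name and the request schema are illustrative assumptions; the actual payload format must match the deployed model container.

```python
import json

def build_payload(prompt: str, max_new_tokens: int = 256,
                  temperature: float = 0.2) -> bytes:
    """Serialize a generation request as JSON bytes.

    The schema here is an assumption; it must match whatever the
    deployed model container expects.
    """
    body = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens,
                       "temperature": temperature},
    }
    return json.dumps(body).encode("utf-8")

def invoke(endpoint_name: str, prompt: str) -> str:
    """Call a SageMaker real-time endpoint (requires AWS credentials)."""
    import boto3  # imported lazily so build_payload stays testable offline

    runtime = boto3.client("sagemaker-runtime")
    resp = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_payload(prompt),
    )
    return resp["Body"].read().decode("utf-8")
```

In production this call typically sits behind Lambda and API Gateway, which handle authentication, throttling, and request shaping before the payload reaches the endpoint.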
Other
- Collaborate with data scientists, ML engineers, and DevOps teams to operationalize AI models in production.
- Stay up to date with the latest in generative AI research, tools, and AWS innovations.
- Excellent communication and documentation skills.
- This is a remote position.