Nebius is seeking a Senior ML Solutions Architect to help customers use its serverless inference platform for open-source LLMs across multiple modalities. The role involves designing and implementing customized LLM-based solutions and architecting scalable AI applications.
Requirements
- 5+ years of experience in ML/AI systems, with at least 2 years focused on LLMs and generative AI.
- Deep knowledge of the LLM ecosystem, including model architectures and fine-tuning approaches.
- Hands-on experience with prompt engineering and LLM pipeline development, including evaluation.
- Hands-on experience with agentic frameworks such as LangChain, LangSmith, smolagents, or equivalent.
- Hands-on experience with vector databases and RAG implementation patterns.
- Hands-on experience deploying LLM-powered applications using APIs from OpenAI, Anthropic, or open-source models.
- Strong Python programming skills.
Responsibilities
- Design and implement LLM-based solutions using Nebius Token Factory’s inference services to drive business value and support customer goals.
- Build production-ready applications leveraging our serverless LLM APIs, including multimodal models (text, vision, audio) and domain-specific models.
- Provide technical expertise in prompt engineering, RAG architectures, model selection, and inference optimization.
- Collaborate with product and engineering teams to surface customer feedback and shape the platform roadmap.
- Guide customers in scaling from proof of concept (POC) to production, with a focus on performance, reliability, and cost efficiency.
Other
- Excellent communication skills, with the ability to clearly explain technical concepts to diverse audiences.
- Opportunities for professional growth within Nebius.
- Flexible working arrangements.
- A dynamic and collaborative work environment that values initiative and innovation.