Cisco is seeking a Senior Engineering Manager to lead teams building, deploying, and optimizing Large Language Model (LLM)-based applications, with a strong emphasis on LLMOps, Retrieval-Augmented Generation (RAG) pipelines, and scalable production systems.
Requirements
- 8+ years of software engineering experience, with 3+ years in engineering management or technical leadership roles.
- Proven track record of shipping production-grade ML/LLM systems.
- Strong understanding of LLMs, fine-tuning, prompt engineering, vector databases (e.g., Pinecone, Weaviate, FAISS), and RAG patterns.
- Experience with cloud-native architectures (AWS, GCP, or Azure) and container orchestration (Kubernetes).
- Proficiency in Python and familiarity with AI/ML frameworks such as PyTorch, Transformers, LangChain, or similar.
- Experience managing or working with multi-modal or multi-agent systems.
- Exposure to regulatory or compliance frameworks for ML systems (e.g., GDPR, SOC 2).
Responsibilities
- Lead and grow a high-performing engineering team focused on LLM applications and infrastructure.
- Design and oversee scalable LLMOps pipelines, including fine-tuning, evaluation, deployment, monitoring, and optimization of large language models.
- Oversee the design and implementation of RAG pipelines, including vector database management, chunking strategies, embedding selection, retrieval tuning, and relevance evaluation.
- Own architectural decisions for high-availability, low-latency systems powering generative AI applications.
- Collaborate with infrastructure and DevOps teams on scaling inference workloads (e.g., GPU clusters, model quantization, caching, and sharding).
- Champion model observability, incident response, prompt versioning, and feedback loops.
- Ensure the team follows responsible AI practices and data governance requirements.
Other
- Bachelor's, Master's, or Ph.D. degree in a relevant field.
- Travel may be required.
- Must be eligible to work in the U.S. and/or Canada.
- Strong communication and collaboration skills.
- Ability to work in a fast-paced environment and prioritize effectively.