MARA is redefining the future of sovereign, energy-aware AI infrastructure and is seeking a Lead Software Engineer to design, build, and scale systems that power agentic and intelligent workloads across its product ecosystem.
Requirements
- Proficiency in Python, with a strong understanding of ML toolchains (PyTorch, Hugging Face, LangChain, MLflow, Ray, etc.).
- Proven experience with model evaluation, fine-tuning, and deployment across cloud and on-prem environments.
- Hands-on experience with RAG architectures and vector databases (Weaviate, Milvus, pgvector, LanceDB, FAISS).
- Deep understanding of prompt design, orchestration, and versioning using CI/CD workflows and automated testing frameworks.
- Familiarity with agentic systems built through both code-driven and visual-builder interfaces (LangGraph Studio, Dust, Flowise, Relevance AI, etc.).
- Strong knowledge of guardrail techniques (rule-based filters, policy evaluators, toxicity detection, grounding validation).
- Experience deploying ML systems on Kubernetes and serverless environments with observability (Prometheus, Grafana, OpenTelemetry).
Responsibilities
- Lead architecture and development of agentic platforms that integrate multiple models, tools, and knowledge sources into dynamic reasoning systems.
- Evaluate and deploy foundation and open-source models (LLMs, vision, multimodal) using efficient inference strategies and fine-tuning where applicable.
- Design and maintain prompt lifecycle pipelines with version control, testing, and CI/CD integration ("PromptOps").
- Build and optimize RAG systems: vector database configuration, retriever-generator orchestration, and embedding quality improvement.
- Implement guardrail frameworks for content safety, hallucination control, and policy enforcement across agentic workflows.
- Integrate and extend agentic frameworks (LangChain, LangGraph, CrewAI, AutoGen, or equivalent) in both code-based and visual orchestration environments.
- Define observability and evaluation metrics for model performance, latency, and behavior drift in production.
Other
- 8+ years of professional software engineering experience, including 3+ years in ML application development or AI platform engineering.
- Excellent communication and leadership skills, with the ability to translate complex ML concepts into actionable engineering outcomes.
- Background in HPC, ML infrastructure, or sovereign/regulated environments (preferred).
- Familiarity with energy-aware computing, modular data centers, or ESG-driven infrastructure design (preferred).
- Experience collaborating with European and global engineering partners (preferred).