Datadog is looking to solve complex problems in cloud observability and security through AI research initiatives.
Requirements
- PhD in Computer Science, Machine Learning, or a related field
- Extensive experience in designing and implementing deep learning models
- Strong background in distributed training frameworks (e.g., DeepSpeed, Megatron-LM) and ML libraries (PyTorch, TensorFlow)
- Proven track record of conducting impactful research in the field with publications at top-tier venues
- Familiarity with efficient training, fine-tuning, and inference techniques for large foundation models
- Experience with GPU programming and optimization, including CUDA (bonus)
- Experience writing production data pipelines and applications (bonus)
Responsibilities
- Conduct cutting-edge research in Generative AI and Machine Learning
- Leverage large-scale distributed training infrastructure to train and fine-tune state-of-the-art models on diverse, real-world telemetry data
- Lead and contribute to research publications, and present findings at top-tier conferences
- Collaborate with cross-functional teams to integrate advanced AI capabilities into Datadog’s product ecosystem
- Stay at the forefront of LLMs, Foundation Models, and Generative AI research
- Foster a culture of scientific rigor, innovation, and practical impact
Other
- Excel at explaining complex models and research findings to both technical and non-technical audiences
- Have a strong interest in open science and open-source contributions
- Competitive global benefits
- New hire stock equity (RSUs) and employee stock purchase plan (ESPP)
- Opportunity to collaborate closely with colleagues across the Datadog offices in New York City and Paris
- Opportunity to attend and present at conferences and meetups
- Intra-departmental mentor and buddy program for in-house networking
- An inclusive company culture, with the ability to join our Community Guilds (Datadog employee resource groups)