Demandbase is looking to improve the core pipelines that power its Identification product and design new processes to enable the data science team to test and deploy new ML/AI models.
Requirements
- Object-oriented / strongly typed programming expertise (Scala, Java, etc.)
- Productionizing and deploying Spark pipelines
- Complex SQL
- Apache Airflow or a similar orchestration tool
- Strong SDLC principles (CI/CD, unit testing, Git process, etc.)
- Strong understanding of AWS services (IAM, EC2, S3)
- Python, Distributed Computing, Ad-Targeting, GenAI
Responsibilities
- Lead initiatives to build, expand, and improve real-world entity identification datasets
- Design and build new pipelines to increase identification coverage and detect errors
- Collaborate with a skilled data science team to enable new ML/AI model development
- Provide insights into optimizing existing pipelines for performance and cost-efficiency
- Create and document descriptive plans for new feature implementation
- Coordinate with downstream stakeholders who depend on identification datasets
Other
- Bachelor’s degree in computer science, engineering, mathematics, or related field
- 8+ years of relevant experience
- Flexible PTO policy, 15 paid holidays in 2025 (including a three-day break around July 4th and a full week off for Thanksgiving), and No Internal Meetings Fridays
- Comprehensive benefits package designed to support your health, well-being, and financial security