Demandbase is looking to connect with talented Senior Data Engineers to build and optimize large-scale data systems for its Identification product, a critical component of its account intelligence platform. The role involves improving core pipelines and designing new processes that help data science teams test and deploy ML/AI models.
Requirements
- Object-oriented / strongly typed programming (Scala, Java, etc.)
- Productionizing and deploying Spark pipelines
- Complex SQL
- Apache Airflow or similar orchestration tools
- Strong SDLC principles (CI/CD, unit testing, Git process, etc.)
- Solid understanding of AWS services (IAM, EC2, S3)
- An interest in data science
Responsibilities
- Lead initiatives to build, expand, and improve real-world entity identification datasets
- Design and build new pipelines to increase identification coverage and detect errors
- Collaborate with a skilled data science team to enable new ML/AI model development
- Provide insights into optimizing existing pipelines for performance and cost-efficiency
- Create and document detailed plans for implementing new features
- Coordinate with downstream stakeholders who depend on identification datasets
Other
- 8+ years of relevant experience
- Bachelor’s degree in computer science, engineering, mathematics, or a related field
- Experience with Python, distributed computing, ad-targeting, or GenAI
- Background in the ad-tech industry
- Experience modeling and working with graph-based datasets