OpenAI's Business Data Science & Analytics team needs to build data pipelines and core marketing datasets to understand and optimize marketing and partnership performance, measure ROI, guide investment decisions, and accelerate product adoption.
Requirements
- Are proficient in Python, Scala, or Java for data engineering.
- Have experience with distributed processing technologies (e.g., Hadoop, Flink) and distributed storage systems (e.g., HDFS, S3).
- Are skilled with ETL orchestration tools such as Airflow, Dagster, or Prefect.
- Have a solid understanding of Spark, including writing, debugging, and optimizing Spark code.
- Bring familiarity with marketing data sources (e.g., ad platforms, attribution systems, CRM, web analytics).
Responsibilities
- Design, build, and manage pipelines that integrate marketing and partnership data into our data warehouse.
- Develop canonical datasets to track key business metrics such as spend, LTV, CAC, ROI, and incremental performance.
- Implement robust systems for data ingestion and processing across multiple channels.
- Participate in data architecture and engineering decisions that define the foundation for marketing analytics.
- Ensure the security, integrity, and compliance of data according to industry and company standards.
Other
- Have 3+ years of experience as a Data Engineer and 8+ years of overall software engineering experience (including data engineering).
- Partner with Marketing, Partnerships, Data Science, Finance, and Product teams to understand data needs and deliver scalable solutions.
- Thrive in ambiguity, love to build from 0→1, and want your work to directly shape how OpenAI grows.
- We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.