Vast.ai is looking to build and own the end-to-end data platform to organize, optimize, and orient the world's computation, democratizing and decentralizing AI computing.
Requirements
- 3+ years (typically 3–6) in a Data Engineering role building production ELT/ETL on a cloud platform (AWS strongly preferred).
- Expert SQL and solid Python for data processing/automation.
- Proven experience designing data models (staging, marts, star schemas) and standing up a warehouse/lakehouse.
- Experience with orchestration, scheduling, and operational ownership (SLAs, alerting, runbooks).
- Experience enabling a BI layer (ideally QuickSight) with secure, governed datasets.
- AWS: S3, Glue/Athena or Redshift, Lambda/Step Functions, IAM/KMS
- Orchestration & Modeling: Airflow or Dagster; dbt (or equivalent SQL modeling)
Responsibilities
- Own the data pipeline: design, build, and operate batch/streaming ingestion from product, billing, CRM, support, and marketing/ad platforms into a central warehouse.
- Model the data: create clean, well-documented staging and business marts (dimensional/star schemas) that map to the needs of Marketing, Sales, Accounting/Finance, and Operations.
- Enable BI: publish certified datasets with row-/column-level security, manage refresh SLAs, and make it easy for teams to self-serve.
- Collaborate cross-functionally: intake requirements, translate them into data contracts and models, and partner with Engineering on event/telemetry capture.
- Document & scale: maintain clear docs, lineage, and a pragmatic data catalog so others can discover and trust the data.
- Lead the next step-function in maturity using a pragmatic, AWS-centric stack.
- Automated ingestion + core staging tables with data quality checks and alerts.
Other
- On-site at our office in Westwood, Los Angeles
- Full-time • On-site • Immediate start preferred
- Strong intrinsic drive, a true passion for uncovering insights from data, and a mix of analytical, programming, and communication skills.
- Strong collaboration and communication; able to gather requirements from non-technical stakeholders and translate them into data contracts.
- Ambitious, fast-paced startup culture where initiative is rewarded