
Get Jobs Tailored to Your Resume

Filtr uses AI to scan 1,000+ jobs and find postings that perfectly match your resume

Sr. Data Engineer, New Venture

Sanity

$210,000 - $265,000
Oct 28, 2025
Remote, US

Sanity.io aims to enable machines to truly understand and use human-created content by building systems that structure and enrich large volumes of information for AI agents and LLMs.

Requirements

  • 5+ years of data engineering experience, with at least 2 years focused on AI/ML data pipelines or supporting machine learning workloads.
  • High level of proficiency in Python and SQL.
  • Strong experience with distributed data processing frameworks like Apache Spark, Dask, or Ray.
  • Proficiency with GCP and its data services.
  • Experience with real-time data streaming technologies like Kafka, Redpanda, or NATS.
  • Familiarity with vector databases (e.g., Milvus, Elasticsearch, Vespa) and their role in AI applications.
  • Experience with data modeling, schema design, and working with both relational and NoSQL databases (PostgreSQL, MongoDB, Cassandra).

Responsibilities

  • Design, build, and optimize scalable data pipelines for AI and ML workloads, handling large volumes of structured and unstructured content data.
  • Architect data processing systems that transform, enrich, and prepare content for LLM consumption, with a focus on latency optimization and cost efficiency.
  • Build ETL/ELT workflows that extract, transform, and load data from diverse sources to support real-time and batch AI operations.
  • Implement data quality monitoring and observability systems to ensure pipeline reliability and data accuracy for AI models.
  • Collaborate with engineers and product teams to understand data requirements and design optimal data architectures that support AI features.
  • Optimize data storage strategies across data lakes, warehouses, and vector databases to balance performance, cost, and scalability.
  • Build automated data validation and testing frameworks to maintain data integrity throughout the pipeline.

Other

  • Based in the San Francisco Bay Area and able to work at least 2 days per week in our San Francisco office.
  • Strong focus on performance optimization, cost management, and building systems that scale efficiently.
  • Ability to write clean, well-documented, maintainable code with proper testing practices.
  • Excellent problem-solving skills and a data-driven approach to decision making.
  • Strong communication skills and ability to collaborate effectively with cross-functional teams.