Point Wild helps customers monitor, manage, and protect against the risks associated with their identities and personal information in a digital world.
Requirements
- 5+ years of Data Engineering experience, including strong experience building production data systems on Databricks
- Expertise in PySpark, SQL, and Python
- Strong knowledge of Delta Lake, Parquet, and lakehouse architectures
- Experience with streaming frameworks (Structured Streaming, Kafka, Kinesis, or Pub/Sub)
- Familiarity with DBT for transformation and analytics workflows
- Strong understanding of data governance and security controls (Unity Catalog, IAM)
- Exposure to AI/ML data workflows (feature stores, embeddings, vector databases)
Responsibilities
- Build and optimize data ingestion pipelines on Databricks (batch and streaming) to process structured, semi-structured, and unstructured data.
- Implement scalable data models and transformations leveraging Delta Lake and open data formats (Parquet, Delta).
- Design and manage workflows with Databricks Workflows, Airflow, or equivalent orchestration tools.
- Implement automated testing, lineage, and monitoring frameworks using tools like Great Expectations and Unity Catalog.
- Build integrations with enterprise and third-party systems via cloud APIs, Kafka/Kinesis, and connectors into Databricks.
- Partner with AI/ML teams to provision feature stores, integrate vector databases (Pinecone, Milvus, Weaviate), and support RAG-style architectures.
- Optimize Spark and SQL workloads for speed and cost efficiency across multi-cloud environments (AWS, Azure, GCP).
Other
- Bachelor's or Master's degree in Computer Science, Engineering, or related field.
- Detail-oriented, collaborative, and comfortable working in a fast-paced innovation-driven environment
- Data Engineering experience in a B2B SaaS organization
- See your impact and accelerate your career in a fast-paced, growth-oriented environment
- Work with other talented people at a company where people matter