Design, build, and optimize modern data platforms that power advanced analytics and AI solutions, enabling clients to accelerate digital transformations, adopt AI responsibly, and achieve measurable business outcomes.
Requirements
- Proficiency in Python, Scala, or Java for production-grade pipelines, with strong skills in SQL and PySpark
- Hands-on experience with cloud platforms (AWS, GCP, Azure, Oracle) and modern data storage/warehouse solutions (Snowflake, BigQuery, Redshift, Delta Lake)
- Practical experience with Databricks, AWS Glue, and transformation and deployment frameworks (dbt, Dataform, Databricks Asset Bundles)
- Knowledge of distributed processing frameworks (Spark, Dask, Flink) and streaming platforms (Kafka, Kinesis, Pulsar) for real-time and batch processing
- Familiarity with workflow orchestration tools (Airflow, Dagster, Prefect), CI/CD for data workflows, and infrastructure-as-code (Terraform, CloudFormation)
- Understanding of DataOps principles, including pipeline monitoring, testing, and automation, with exposure to data quality and observability tools (Great Expectations, Datadog, Prometheus)
- Exposure to ML platforms (Databricks, SageMaker, Vertex AI), MLOps best practices, and GenAI toolkits (LangChain, LlamaIndex, Hugging Face)
Responsibilities
- Develop a streaming data platform to integrate telemetry for predictive maintenance in aerospace systems
- Implement secure data pipelines that reduce time-to-insight for a Fortune 500 utility company
- Optimize large-scale batch and streaming workflows for a global financial services client, cutting infrastructure costs while improving performance
- Develop pipelines for embeddings and vector databases to enable retrieval-augmented generation (RAG) for a global defense client
- Collaborate with clients and interdisciplinary teams to architect scalable pipelines, manage secure and compliant data environments, and unlock the value of complex datasets across industries
- Work in cross-functional Agile teams with Data Scientists, Machine Learning Engineers, Designers, and domain experts to deliver high-quality analytics solutions
Other
- 2+ years of professional experience in data engineering, software engineering, or adjacent technical roles
- Willingness to travel as required
- Strong communication, time management, and resilience, with the ability to align technical solutions to business value
- Degree in Computer Science, Business Analytics, Engineering, Mathematics, or related field
- A mindset that thrives in a high-performance, high-reward culture: doing hard things, picking yourself up when you stumble, and having the resilience to try another way forward