The company is looking for an engineer to design, build, and optimize the data infrastructure that powers its near-real-time processing, a role that calls for scalable data systems, reliable data delivery, and efficient data management.
Requirements
- Strong expertise in SQL and data modeling.
- Proficiency in at least one programming language (Python, Java, or Scala).
- Experience with cloud platforms (AWS, GCP, or Azure) and modern data stacks (Snowflake, BigQuery, Databricks, Redshift, etc.).
- Hands-on experience with distributed data systems (Spark, Kafka, Flink, or similar).
- Knowledge of data governance, security, and compliance best practices.
- Familiarity with CI/CD and infrastructure-as-code (Terraform, CloudFormation).
- Exposure to machine learning workflows and MLOps practices.
- Excellent problem-solving skills and ability to communicate complex ideas clearly.
Responsibilities
- Design, develop, and maintain scalable ETL/ELT pipelines.
- Build robust data models and warehouses to support analytics and reporting.
- Implement data quality checks, monitoring, and governance frameworks.
- Optimize database queries and data pipelines for performance and cost efficiency.
- Ensure high availability and reliability of data infrastructure.
- Stay up to date with emerging data engineering technologies and evaluate their applicability.
- Work closely with data scientists, analysts, and software engineers to ensure smooth data flow across systems.
- Partner with product and business stakeholders to understand data needs and translate them into solutions.
- Provide technical mentorship to junior engineers and contribute to best practices.
Other
- 100% Remote
- Full-time