GoGuardian is hiring a Data Engineer II to design, build, and continuously improve its analytics and AI/ML ecosystem and to enhance its core data platform.
Requirements
- Proficiency in Python and SQL, with experience in PySpark, pandas, or similar data processing frameworks.
- Experience with dbt.
- Experience with modern data warehousing and lakehouse platforms, preferably Databricks.
- Hands-on experience with workflow orchestration tools such as Airflow, Dagster, or Prefect.
- Strong understanding of data modeling, ETL design, and distributed data systems.
- Experience with AWS data and compute services (S3, Lambda, ECS, CloudWatch, etc.) or equivalent cloud platforms.
- Familiarity with MLOps concepts (e.g., feature stores, model registries, CI/CD for ML).
Responsibilities
- Design, build, and optimize ETL pipelines that power analytics, data science, and ML workflows using tools such as Databricks, PySpark, and Airflow.
- Develop and maintain labeling and retraining pipelines for machine learning models, ensuring quality, reproducibility, and observability.
- Implement and support MLOps practices, including model versioning, CI/CD for ML, and model monitoring in production environments.
- Collaborate with data scientists to productionize and scale model training, inference, and evaluation pipelines.
- Contribute to the design and evolution of the data lakehouse, including schema design, partitioning strategies, and performance optimization.
- Document and communicate data architecture, lineage, and dependencies to ensure transparency and maintainability across teams.
- Champion data quality and governance, ensuring that datasets are accurate, well-structured, and compliant with organizational standards.
Other
- 2–4 years of experience building and operating large-scale data systems, ideally supporting analytics and ML workloads.
- Experience using Infrastructure as Code, preferably Terraform.
- Excellent problem-solving, collaboration, and communication skills; comfortable working in a dynamic, fast-paced environment.
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- Remote work.