Migrate multiple analytics and machine learning applications from a legacy SQL environment to Amazon Redshift, and standardize their codebases on a modern Python architecture.
Requirements
Strong knowledge of Python project structures, dependency management, and packaging tools (pip, poetry, conda).
Experience migrating applications from legacy SQL databases to cloud data warehouses (Redshift, Snowflake, BigQuery), ensuring data consistency.
Proficiency in SQL and experience optimizing queries for cloud warehouses.
Demonstrated ability to write robust tests (pytest/unittest) and integrate them with CI/CD pipelines.
Familiarity with containerization, orchestration, and workflow tools such as Docker, Kubernetes, Airflow, or Step Functions.
Experience with dbt-modeled data warehouses and collaboration with analytics engineers.
Knowledge of MLOps tools, model validation frameworks, and feature stores.
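The project-structure, configuration, and dependency-management expectations above can be illustrated with a minimal sketch. All names (`AppConfig`, `build_logger`, the environment variables) are hypothetical, not part of the actual codebase:

```python
import logging
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    """Hypothetical application settings, read from environment variables."""
    redshift_host: str
    redshift_db: str
    log_level: str = "INFO"

    @classmethod
    def from_env(cls) -> "AppConfig":
        # Defaults are illustrative only.
        return cls(
            redshift_host=os.environ.get("REDSHIFT_HOST", "localhost"),
            redshift_db=os.environ.get("REDSHIFT_DB", "analytics"),
            log_level=os.environ.get("LOG_LEVEL", "INFO"),
        )


def build_logger(config: AppConfig) -> logging.Logger:
    """One standardized logger setup shared across all migrated applications."""
    logger = logging.getLogger("app")
    logger.setLevel(config.log_level)
    return logger
```

Centralizing configuration in a typed dataclass like this is one common way to keep packaging, twelve-factor-style config, and logging consistent across many applications.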
Responsibilities
Review existing Python applications to map dependencies, data access patterns, configuration, and deployment processes.
Transition data pipelines to read from Redshift, eliminating all dependencies on the legacy SQL environment.
Standardize code organization, packaging, configuration, logging, and containerization according to a modern reference framework.
Develop unit and integration tests for data ingestion, transformations, and model outputs, integrating them into CI/CD pipelines.
Document code, add clear type hints, improve readability, and produce operational runbooks for all applications.
Update deployment pipelines using containerization and orchestration tools to ensure repeatable, automated releases.
Provide guidance and training to engineers on modern development standards, testing practices, and Redshift integration.
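The testing responsibility above (unit tests for transformations, wired into CI/CD) can be sketched as a small pytest-style example. The transformation and its schema are hypothetical, chosen only to show the shape of such a test:

```python
# Hypothetical transformation step applied to rows fetched from Redshift;
# column names and cleaning rules are illustrative, not from the posting.

def normalize_rows(rows: list[dict]) -> list[dict]:
    """Lower-case column names and strip whitespace from string values."""
    cleaned = []
    for row in rows:
        cleaned.append({
            key.lower(): value.strip() if isinstance(value, str) else value
            for key, value in row.items()
        })
    return cleaned


def test_normalize_rows():
    # pytest discovers and runs functions named test_* automatically.
    raw = [{"User_ID": 1, "Email": "  a@example.com "}]
    assert normalize_rows(raw) == [{"user_id": 1, "email": "a@example.com"}]
```

Because the transformation is a pure function, the test needs no live Redshift connection and can run on every CI pipeline execution.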
Other
7+ years of professional experience developing production ML or analytics applications in Python.
Strong documentation skills and ability to coach other engineers on sustainable development practices.
Remote (U.S. time zones; must overlap at least 4 hours with U.S. Central Time)