Grainger is looking to support the day-to-day operations of its data infrastructure, processes, and pipelines by ensuring platform stability, addressing operational issues, and meeting SLAs during production incidents.
Requirements
- 2+ years of experience in batch and streaming ETL using Spark, Python, Scala, Snowflake, or Databricks for data engineering or ML workloads.
- 2+ years of experience orchestrating and managing pipelines with workflow tools such as Databricks Workflows, Apache Airflow, or Luigi.
- 2+ years of experience designing, building, and optimizing CI/CD pipelines using tools such as GitHub and Jenkins.
- Experience building observability and monitoring frameworks for data infrastructure.
- Experience with Infrastructure as Code (IaC) tools such as Terraform and Ansible.
- Experience with DevOps or DataOps practices.
- Experience with AWS or other cloud services (e.g., AWS Glue, Athena, Lambda, S3).
Responsibilities
- Troubleshoot and resolve operational issues to ensure platform stability and timely recovery.
- Perform root cause analysis for production incidents; document findings and implement long-term, best-practice fixes.
- Improve operational efficiency by building alerts, observability dashboards, and automation, and by reducing redundancies.
- Partner with stakeholders across data, design, product, and executive teams, and assist them with data-related technical issues.
- Stay current with industry trends and evaluate emerging technologies to enhance infrastructure and processes.
Other
- Hybrid work arrangement.
- Strong problem-solving skills with the ability to work independently under pressure.
- Bachelor’s degree or equivalent experience.
- Individuals requiring sponsorship (e.g., OPT or H-1B visa status) should not apply. Only individuals authorized to work in the United States now and for the foreseeable future will be considered for this position.
- Preference for candidates based in either Lake Forest, IL or downtown Chicago.