The Coca-Cola Company's Global Equipment Platforms (GEP) team is seeking a Lead Data Engineer to build and maintain the data backbone for its global fleet of 17MM+ connected equipment, transforming raw telemetry data into a strategic asset for market insights, predictive analytics, and operational efficiencies.
Requirements
- Expert-level proficiency in designing, building, and operating data pipelines and data solutions in Microsoft Azure (e.g., Azure Data Factory, Azure Synapse Analytics, Azure Databricks, Event Hubs, IoT Hub, ADLS Gen2).
- Deep experience with real-time data streaming architectures and technologies (e.g., Kafka, Azure Event Hubs/IoT Hub).
- Strong programming skills in Python (preferred), Scala, or Java; expert-level SQL.
- Extensive experience with big data technologies and distributed computing frameworks (e.g., Spark).
- Solid understanding of data warehousing concepts, dimensional modeling, and data lake architectures.
- Proven track record of implementing robust data quality and security measures.
- Experience with CI/CD practices for data pipelines and infrastructure as code (e.g., Terraform, ARM templates).
Responsibilities
- Design, develop, and maintain highly scalable, secure, and resilient data pipelines (batch, streaming, real-time) and data platforms on Microsoft Azure (e.g., Azure Data Factory, Azure Synapse Analytics, Azure Databricks, Event Hubs, IoT Hub, ADLS Gen2, Cosmos DB, Azure SQL).
- Implement robust data ingestion processes to collect high-volume telemetry data from 17MM+ connected devices, ensuring data quality and reliability at the source.
- Develop efficient data transformation logic and data models to rationalize, cleanse, and enrich raw equipment data, making it ready for consumption by analytics and AI applications.
- Optimize data storage solutions (data lakes, data warehouses) for performance, cost-efficiency, and accessibility.
- Design and build scalable, resilient data pipelines for real-time telemetry, ensuring the data quality and accessibility that directly fuel the analytical models developed by Data Scientists and the production AI systems deployed by the Lead AI Engineer.
- Engineer the data foundation required to support advanced analytics, machine learning models (e.g., predictive maintenance, demand forecasting, personalization), and AI Agents.
- Work closely with Data Scientists and Data Analysts to understand their data needs, ensure data quality, and optimize data structures for efficient model training and inference.
Other
- Bachelor's degree in Computer Science, Engineering, Information Systems, or a related quantitative field. Master's degree preferred.
- 10+ years of hands-on experience in data engineering, with a strong focus on building and operating large-scale data platforms.
- Collaborate closely with the Global Product Owner for Unified IoT, GEP Hardware/Software Engineering, Enterprise Digital Technology Solutions (IT), and Experience Design teams to ensure seamless data integration from connected devices.
- Work with equipment OEMs (e.g., Lancer, Cornelius, True, Imbera) to integrate their telemetry systems and ensure data capture adheres to The Coca-Cola Company's (TCCC) standards.
- Deep, hands-on expertise in data engineering principles, tools, and best practices, with a commitment to continuously expanding that knowledge.