Zoom is looking to build, optimize, and maintain data pipelines using modern big data technologies to handle high-volume data, support AI, and enable data-driven decision-making across the organization.
Requirements
- Expertise in ETL design, development, and data warehousing concepts.
- Expertise in SQL, including writing and optimizing complex queries.
- Deep expertise in Python programming, with a focus on object-oriented principles.
- Experience with distributed data processing frameworks like Apache Spark and Apache Flink.
- Experience with AWS or other major cloud platforms.
- Proficiency in Linux/Unix shell scripting.
- Solid understanding of the big data ecosystem, including tools like Hive, Presto, Hadoop, HDFS, Parquet, and ORC.
Responsibilities
- Designing, developing, and maintaining ETL processes and data pipelines for structured and unstructured data, and optimizing data warehouses and analytical data models.
- Writing and tuning complex SQL queries to support analytics, reporting, and application needs.
- Implementing distributed data processing solutions using Apache Spark and Apache Flink with Scala or Python.
- Building streaming and messaging solutions using Apache Kafka.
- Working with and optimizing solutions in the big data ecosystem (e.g., Hive, Presto, HDFS, Parquet, ORC).
- Deploying and managing workloads using Kubernetes (K8s) in cloud or hybrid environments.
- Developing scripts in Linux/Unix shell for automation and data processing tasks.
Other
- Bachelor’s degree in Computer Science, Data Engineering, or a related field, OR 1–3 years of relevant professional experience.
- Collaborating with stakeholders to translate business requirements into data transformations and workflows.
- Ability to work in a hybrid or remote environment.
- Must be willing to work in a fast-paced environment.
- Commitment to fair hiring practices and to accommodating accessibility requests.