The Wikimedia Foundation is seeking a Senior Data Engineer to shape the future of its data ecosystem, improving its data capabilities to serve both internal teams and the global community.
Requirements
- Expertise in tools like Airflow, Kafka, Spark, and Hive
- Advanced proficiency in Python and Java/Scala
- Advanced SQL skills, including experience across multiple database engines and query dialects
- Familiarity with additional technologies such as Flink, Iceberg, Druid, Presto, Cassandra, Kubernetes, and Docker
- Expertise in AI development tooling and AI applications in data engineering and analytics
- Familiarity with stream processing frameworks like Spark Streaming or Flink
Responsibilities
- Designing and Building Data Pipelines
- Monitoring and Alerting for Data Quality
- Supporting Data Governance and Lineage
- Data Platform Development
- Enhancing Operational Excellence
Other
- 5+ years of data engineering experience
- Practical knowledge of engineering best practices with a strong emphasis on system robustness and maintainability
- Hands-on experience in troubleshooting systems and pipelines for performance and scaling
- Strong communication and collaboration skills to interact effectively within and across teams
- Ability to produce clear, well-documented technical designs and articulate ideas to both technical and non-technical stakeholders