TP-Link Systems Inc. is looking for an engineer to build and maintain the scalable data pipelines behind its mission of enhancing people's lives through faster, more reliable connectivity.
Requirements
- Experience requirements: 5+ years in data engineering, software engineering, or data infrastructure, with proven experience building and operating large-scale data pipelines and distributed systems in production, including terabyte-scale big data environments.
- Programming proficiency: Strong Python skills for building data pipelines and processing jobs, with the ability to write clean, maintainable, and efficient code.
- Distributed systems expertise: Deep knowledge of distributed systems and parallel processing concepts.
- Big data frameworks: Strong proficiency in batch processing frameworks such as Apache Spark and related big data technologies (a minimal Spark job sketch follows this list).
- Database and data warehouse expertise: Strong understanding of relational database concepts and data warehouse principles.
- Workflow orchestration: Hands-on experience with data workflow orchestration tools like Apache Airflow or AWS Step Functions for scheduling, coordinating, and monitoring complex data pipelines (see the Airflow sketch after this list).
- Problem solving and collaboration: Excellent problem-solving skills, strong attention to detail, and the ability to work effectively in collaborative team environments.
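To illustrate the kind of Spark work this role involves, here is a minimal PySpark batch job sketch. The bucket, paths, and column names are hypothetical placeholders, not actual TP-Link systems.

```python
# Minimal PySpark batch job sketch; all paths and column names are
# hypothetical examples, not real TP-Link infrastructure.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily_events_batch").getOrCreate()

# Read a day's raw events, aggregate, and write partitioned output.
events = spark.read.parquet("s3://example-bucket/raw/events/dt=2024-01-01/")
daily_counts = (
    events
    .filter(F.col("event_type").isNotNull())
    .groupBy("event_type")
    .agg(F.count("*").alias("event_count"))
)
daily_counts.write.mode("overwrite").parquet(
    "s3://example-bucket/curated/daily_counts/dt=2024-01-01/"
)
spark.stop()
```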
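Similarly, a minimal Airflow DAG sketch shows the scheduling-and-dependency pattern the orchestration requirement refers to. The DAG id, schedule, and task bodies are hypothetical, and the `schedule` argument assumes Airflow 2.4+.

```python
# Minimal Airflow DAG sketch; DAG id, schedule, and task callables are
# hypothetical and shown only to illustrate the orchestration pattern.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("extract raw data")  # placeholder for real ingestion logic

def transform():
    print("transform and load")  # placeholder for real processing logic

with DAG(
    dag_id="example_daily_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    extract_task >> transform_task  # transform runs only after extract succeeds
```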
Responsibilities
- Design and build scalable data pipelines: Develop and maintain high-performance, large-scale data ingestion and transformation, including ETL/ELT processes, data de-identification (see the sketch after this list), and security management.
- Data orchestration and automation: Develop and manage automated data workflows using tools like Apache Airflow to schedule pipelines, manage dependencies, and ensure reliable, timely data processing and availability.
- AWS integration and cloud expertise: Build data pipelines integrated with AWS cloud-native storage and compute services, leveraging scalable cloud infrastructure for data processing (a small boto3 sketch follows this list).
- Monitoring and data quality: Implement comprehensive monitoring, logging, and alerting to ensure high availability, fault tolerance, and data quality through self-healing strategies and robust data validation processes (see the validation sketch after this list).
- Technology innovation: Stay current with emerging big data technologies and industry trends, recommending and implementing new tools and approaches to continuously improve data infrastructure.
- Technical leadership: Provide technical leadership for data infrastructure teams, guide architecture decisions and system design best practices, mentor junior engineers through code reviews and knowledge sharing, lead complex projects from concept to production, and help foster a culture of operational excellence.
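As one common approach to the de-identification responsibility, the sketch below replaces direct identifiers with salted hashes in PySpark. The column names, paths, and salt handling are hypothetical, not a description of TP-Link's actual method.

```python
# Sketch of column-level de-identification with PySpark; column names,
# paths, and the salt source are hypothetical examples.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("deidentify_users").getOrCreate()
users = spark.read.parquet("s3://example-bucket/raw/users/")

# Replace direct identifiers with salted SHA-256 digests and drop
# free-text fields that cannot be safely retained.
SALT = "example-salt"  # in practice, fetch from a secrets manager
deidentified = (
    users
    .withColumn("email_hash", F.sha2(F.concat(F.col("email"), F.lit(SALT)), 256))
    .drop("email", "full_name", "notes")
)
deidentified.write.mode("overwrite").parquet(
    "s3://example-bucket/deidentified/users/"
)
spark.stop()
```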
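For the AWS integration responsibility, here is a minimal boto3 sketch of pushing a pipeline artifact into cloud storage. The bucket, key, and file names are hypothetical placeholders.

```python
# Minimal boto3 S3 sketch; bucket, key, and file names are hypothetical.
import boto3

s3 = boto3.client("s3")

# Upload a locally produced artifact to S3, then verify it landed.
s3.upload_file("daily_counts.parquet", "example-bucket", "curated/daily_counts.parquet")
head = s3.head_object(Bucket="example-bucket", Key="curated/daily_counts.parquet")
print(head["ContentLength"])  # size in bytes of the uploaded object
```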
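Finally, a simple data-quality gate of the sort the monitoring responsibility describes: checks that raise on failure so the orchestrator can alert or retry. Thresholds, paths, and column names are hypothetical.

```python
# Simple data-quality gate sketch; paths, thresholds, and column names
# are hypothetical. A failed check raises, surfacing the run as failed
# to the orchestrator, which can then alert or retry.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq_checks").getOrCreate()
df = spark.read.parquet("s3://example-bucket/curated/daily_counts/")

row_count = df.count()
null_keys = df.filter(F.col("event_type").isNull()).count()

if row_count == 0:
    raise ValueError("data quality check failed: output is empty")
if null_keys > 0:
    raise ValueError(f"data quality check failed: {null_keys} null keys")
print(f"data quality checks passed: {row_count} rows")
spark.stop()
```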
Other
- Bachelor's degree in Computer Science or related field
- No visa sponsorship available