Ohalo is seeking a data engineer to build and maintain data pipelines that support machine learning engineering activities and manage plant phenotype and genotype data.
Requirements
- Proficiency with cloud platforms, preferably GCP (or AWS).
- Strong experience with data processing frameworks such as BigQuery and Spark; experience with Nextflow is a plus.
- Proficiency in Python and frameworks like FastAPI.
- Experience with service-oriented and event-driven architectures.
- Knowledge of data streaming and messaging systems, including Pub/Sub and Kafka.
- Strong problem-solving skills and attention to detail.
- Experience with automation engineering and robotics systems.
Responsibilities
- Design and implement robust data architectures.
- Build and maintain scalable data pipelines using technologies such as GCP (or AWS), BigQuery, Python, and Spark.
- Develop and manage service-oriented and event-driven architectures, utilizing tools like Pub/Sub and Kafka.
- Collaborate with machine learning engineers on model development and deployment.
- Work closely with the automation engineering team to automate data collection from robotics systems.
- Ensure data integrity and security across all pipelines and processes.
- Optimize data workflows and storage solutions for performance and scalability.
Other
- Bachelor's degree in a technical field (e.g., Computer Science, Engineering, Information Technology).
- Minimum of 5 years of experience in a similar role.
- Ability to work independently and as part of a geographically distributed team.
- Excellent communication and collaboration skills.
- No visa sponsorship is available for this position at this time.