TigerGraph is looking for a Data Engineer to help build a distributed, reliable framework for streaming data from various sources into TigerGraph's graph database, enabling advanced analytics and machine learning on connected data for Fortune 500 organizations.
Requirements
- Solid programming fundamentals; experienced with Java, Go, or another major programming language.
- Understanding of distributed systems principles and the ability to evaluate trade-offs in system design.
- Familiar with Kafka or similar streaming technologies; knowledge of Kafka Connect, Flink, or Spark Streaming is a plus.
- Capable of identifying and addressing performance bottlenecks related to serialization, buffering, and data flow in streaming systems.
- Proficient with Linux command-line tools and shell scripting for effective debugging and development workflows.
- Open to adopting AI-assisted engineering practices ("vibe coding") to improve productivity and code quality.
- Experience with data lakehouse technologies such as Apache Iceberg, Delta Lake, or Hudi.
Responsibilities
- Contribute to building a distributed and reliable framework for streaming data from various sources into TigerGraph's graph database.
- Develop efficient data pipelines for ingesting and pre-processing structured and semi-structured data.
- Build lightweight tools and services to monitor and visualize data ingestion and processing flows.
- Stay informed about evolving trends in data science and apply them to streaming infrastructure and architecture.
Other
- This position is primarily remote, but location-based requirements may apply. If the selected candidate is located near one of our company offices, the candidate will have a hybrid work arrangement (2-3 days in-office).
- Proactive and collaborative team player with strong communication skills.
- Bachelor's degree in Computer Science or a related field; 1-3 years of relevant experience preferred.