Camunda is looking to enhance its next-generation analytics product for process orchestration by building reliable, high-throughput SaaS capabilities that transform large data flows into actionable insights. The goal is to measure, improve, and scale performance and fault tolerance in real-world distributed systems.
Requirements
- 3+ years as a full‑stack Software Engineer with strong production experience in Java and React.
- Curiosity and resilience in distributed systems—eager to experiment, learn daily, and “break things on purpose” to improve robustness.
- Interest in, and practical intuition for, building high‑throughput SaaS platforms that process large, continuous data volumes.
- Drive to design and build an enterprise‑grade analytics product—from data ingestion through aggregation to clear, insightful UIs.
- Experience with high‑throughput, low‑latency distributed SaaS products in production.
- Hands‑on work with distributed data stores and practical trade‑offs related to the CAP theorem.
- OpenSearch/Elasticsearch experience, including performance tuning.
Responsibilities
- Collaborate with reliability experts to measure, quantify, and clearly articulate the load, throughput, and performance characteristics of our analytics platform (Optimize), and translate findings into action.
- Propose an iterative roadmap for performance and fault‑tolerance improvements based on metrics you’ve developed and a deep understanding of the product architecture and codebase.
- Implement, validate, and roll out performance improvements in stages—profiling, tuning, and verifying impacts in production‑like conditions.
- Design and build backend APIs (Java) and frontend dashboards (React) to retrieve, aggregate, and visualize analytics data from distributed data stores.
- Create data exporters and connectors that enable third‑party BI tools to pull detailed analytics from multiple distributed instances, safely and efficiently.
- Raise the engineering bar by contributing to code quality, test automation, observability, and collaborative design reviews.
Other
- Pragmatic, autonomous problem‑solver focused on measurable outcomes and continuous improvement, with a keen eye on performance and fault tolerance.
- Willingness and ability to use Camunda products and approach reliability from a user’s perspective.
- Exposure to modern cloud operations (e.g., Kubernetes, Operators, Helm charts), including monitoring and troubleshooting in production.
- Experience with load testing and iterative tuning to improve reliability, throughput, and overall system performance.