Epsilon Data Management LLC is looking for a Senior Software Engineer to develop and maintain a custom cloud-based job scheduling platform, ensuring its resilience, security, and efficiency for client teams.
Requirements
- Develop and enhance new features while ensuring backward compatibility, resolving issues through Python/Java programming, executing tests in a Kubernetes cluster, and utilizing debugging expertise.
- Conduct research to identify potential causes of technical issues and propose strategic solutions for resolution.
- Implement new features and optimize existing product functionalities using Python/Java programming, multithreading, HTML, web frameworks, SQL, persistence frameworks, Docker, API testing, prototyping, Kafka, and unit/integration testing to deliver efficient solutions.
- Analyze and address client-reported issues through code analysis, simulation testing, and log tracing.
- Design/review and utilize Prometheus metrics within the Grafana dashboard to enable real-time monitoring and issue diagnosis.
- Be familiar with core AWS services, including Elastic Compute Cloud (EC2), Lambda, Simple Storage Service (S3), Elastic Container Registry (ECR), and Relational Database Service (RDS), among others.
- Resolve all errors and bugs encountered when creating new features.
Responsibilities
- Research, design, and develop computer and network software or specialized utility programs.
- Provide a custom cloud-based job scheduling platform to client teams.
- Create new features for the platform and integrate them with new Airflow releases.
- Test and debug inside Kubernetes (k8s) orchestration systems to resolve bugs in new features and issues reported by clients.
- Work closely with other client teams to ensure that the platform is resilient and secure.
- Resolve bugs and issues by performing deep analysis of the team codebase and Airflow code, log stacks in Airflow and k8s, metrics collected in Grafana, and other information.
- Develop new functionality for the team’s job scheduling platform.
Other
- Telecommuting available from anywhere in the US.
- Communicate with other teams regarding the problems they face when using our platform.
- Document new features and write user guidance for our platform.
- Provide on-call support for our platform through PagerDuty.
- Self-study to keep track of the latest technology trends and new types of issues.