Coupang is looking to build the future of commerce by solving problems and breaking traditional tradeoffs, and needs an Observability Engineer to help maintain and improve the quality of its services.
Requirements
- Strong experience in implementing and managing observability solutions in large-scale, complex environments.
- Deep knowledge of monitoring, alerting, and logging systems and tools, such as Prometheus, Grafana, Elastic Stack, Datadog, or New Relic.
- Familiarity with distributed tracing technologies, such as Jaeger or Zipkin.
- Experience with cloud-based infrastructure, including AWS, Azure, or Google Cloud Platform.
- Strong understanding of DevOps and SRE practices, including continuous integration, continuous delivery, and infrastructure as code (IaC).
- Proficiency in scripting languages, such as Python, Bash, or Ruby.
- Experience with containerization and orchestration technologies, such as Docker and Kubernetes.
Responsibilities
- Design, implement, and maintain observability solutions such as monitoring, alerting, logging, and tracing across various platforms, applications, and infrastructure.
- Collaborate with cross-functional teams to identify and define observability requirements.
- Develop and implement best practices for creating and maintaining effective monitoring, alerting, and telemetry systems.
- Evaluate and recommend industry-leading observability tools and technologies to improve system visibility and reliability.
- Define and track key performance indicators (KPIs) and service-level objectives (SLOs) related to system availability, performance, and reliability.
- Assist in the troubleshooting and resolution of complex incidents and problems by analyzing data from observability tools.
- Provide guidance and mentorship to other engineers on observability principles, practices, and tools.
Other
- Bachelor's Degree in Computer Science, Engineering, or a related technical field.
- Excellent communication and collaboration skills, with the ability to work with teams across different functions and technical domains.
- Strong problem-solving and analytical skills, with a focus on data-driven decision-making.
- A proven track record of leading and delivering successful observability projects and initiatives.
Benefits
- Medical, dental, vision, life, and AD&D insurance
- Flexible Spending Account (FSA) and Health Savings Account (HSA)
- Long-term/Short-term Disability
- Employee Assistance Program (EAP)
- 401(k) plan with company match
- 18-21 days of Paid Time Off (PTO) per year, based on tenure
- 12 public holidays
- Paid parental leave
- Pre-tax commuter benefits
- MTV - [Free] Electric Car Charging Station