Checkr is building the data platform to power safe and fair decisions. The Senior Data Engineer will contribute to the development and maintenance of systems that conduct accurate and efficient background checks, supporting Checkr's mission of enabling safe and trusted workplaces and communities.
Requirements
- 7+ years of data engineering experience, including 5+ years writing PySpark.
- Experience building large-scale (hundreds of terabytes to petabytes) data processing pipelines, both batch and streaming.
- Experience with ETL/ELT and with stream and batch processing of data at scale.
- Strong proficiency in PySpark and Python.
- Expertise in database systems and data modeling, including relational databases and NoSQL stores such as MongoDB.
- Experience with big data technologies such as Kafka, Spark, and Iceberg, data lake architectures, and the AWS stack (EKS, EMR, Serverless, Glue, Athena, S3, etc.).
- Knowledge of security best practices and data privacy concerns.
Responsibilities
- Create and maintain data pipelines and foundational datasets to support product/business needs.
- Design and build database architectures for massive and complex data, balancing performance against computational load and cost.
- Develop data quality audits at scale and implement alerting as needed.
- Create scalable dashboards and reports to support business objectives and enable data-driven decision-making.
- Troubleshoot and resolve complex issues in production environments.
- Work closely with product managers and other stakeholders to define and implement new features.
Other
- Work in a hybrid work environment with expectations to work from the office 2 to 3 days a week in Denver, CO, San Francisco, CA, or Santiago, Chile.
- Collaborate in an international company based in the United States.
- Travel may be required for in-office work and collaboration.
- Work in a fast-moving and collaborative environment.
- Join a company committed to diversity and inclusion, including welcoming applicants with prior arrest or conviction records.