Socure aims to verify 100% of good identities in real time and eliminate identity fraud from the internet. This role supports that mission by building the data platforms and real-time systems that process large-scale device and behavioral data.
Requirements
- 5+ years of experience building and maintaining large-scale data systems and pipelines in cloud environments.
- Strong programming experience in Python, Go, or Java.
- Proficiency in SQL.
- Understanding of data modeling, ETL/ELT processes, and data warehousing.
- Deep experience with distributed processing and streaming technologies such as Spark, Flink, and Kafka.
- Hands-on experience with AWS services including Glue, Kinesis, EMR, Lambda, and DynamoDB.
- Experience with orchestration tools such as Airflow or AWS Step Functions.
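To illustrate the kind of ETL/ELT and SQL work the requirements above describe, here is a minimal, hypothetical sketch: raw events are loaded into a staging table and a per-user aggregate is computed in SQL. It uses in-memory SQLite purely for illustration; a production pipeline would target a warehouse or lake, and all table and field names here are invented.

```python
import sqlite3

# Hypothetical raw device events, as a pipeline might receive them.
raw_events = [
    {"user_id": "u1", "device_id": "d1", "event": "login"},
    {"user_id": "u1", "device_id": "d2", "event": "login"},
    {"user_id": "u2", "device_id": "d3", "event": "login"},
]

# "Load" step: land the raw records in a staging table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, device_id TEXT, event TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (:user_id, :device_id, :event)", raw_events
)

# "Transform" step, done in SQL: distinct devices per user,
# a common fraud-relevant aggregate.
rows = conn.execute(
    """
    SELECT user_id, COUNT(DISTINCT device_id) AS device_count
    FROM events
    GROUP BY user_id
    ORDER BY user_id
    """
).fetchall()
print(rows)  # [('u1', 2), ('u2', 1)]
```

In a real ELT setup the same pattern scales up: raw data lands first, and transformations run inside the warehouse engine rather than in application code.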
Responsibilities
- Design, build, and maintain scalable data pipelines to collect, clean, transform, and aggregate large-scale device and behavioral signals.
- Build and support real-time velocity systems that compute session-level and user-level aggregations with sub-second latency.
- Develop and manage data infrastructure such as data lakes and data warehouses.
- Optimize data architectures for performance, scalability, cost efficiency, and maintainability.
- Collaborate with ML engineers, data scientists, and product stakeholders to ensure data availability, reliability, and privacy compliance.
- Maintain and enhance orchestration workflows; automate pipeline execution and monitoring.
- Ensure data quality, consistency, and security across systems; implement best practices in data modeling, schema evolution, and pipeline observability.
Other
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related technical field.
- Strong problem-solving and communication skills.
- Experience working in fraud detection, risk scoring, or identity verification domains.
- Strong system design and infrastructure optimization skills in high-scale environments.