Helping clients find answers in their data and advance important missions, such as fraud detection, cancer research, and national intelligence, by organizing data from disparate sources
Requirements
- 2+ years of experience in data engineering, including architecture and modeling
- 2+ years of experience designing data pipelines with Python for retrieving, parsing, and processing structured and unstructured data
- 2+ years of experience developing scalable ETL or ELT workflows for reporting and analytics
- Experience with a cloud platform such as AWS, Microsoft Azure, or Google Cloud
- Experience with distributed data or computing tools such as Spark, Databricks, Hadoop, Hive, AWS EMR, or Kafka
- Experience with database administration for Operational Data Stores (ODS), including tuning, indexing, and access controls
- Experience with AWS Glue and AWS EMR
Responsibilities
- Develop and deploy pipelines and platforms that organize disparate data and make it meaningful
- Manage the assessment, design, building, and maintenance of scalable platforms for clients
- Draw on experience in analytical exploration and data examination to guide a multi-disciplinary team of analysts, data engineers, developers, and data consumers
- Design data pipelines with Python for retrieving, parsing, and processing structured and unstructured data
- Develop scalable ETL or ELT workflows for reporting and analytics
- Work with distributed data or computing tools such as Spark, Databricks, Hadoop, Hive, AWS EMR, or Kafka
- Perform database administration for Operational Data Stores (ODS), including tuning, indexing, and access controls
Other
- Ability to obtain and maintain a Public Trust or Suitability/Fitness determination based on client requirements
- Bachelor's degree
- Ability to work in a fast-paced, agile environment
- Ability to work with and guide a multi-disciplinary team of analysts, data engineers, developers, and data consumers
- Must be willing to be on camera during interviews and assessments