Protagonist is solving the challenge of handling large volumes of data and delivering cutting-edge, data-driven insights to its clients.
Requirements
- 1-3 years of experience in software development using Python
- 1-3 years of experience with Amazon Web Services (AWS); AWS Certification a plus
- Strong proficiency with containerization and orchestration (Docker, Kubernetes)
- Strong proficiency with big data technologies (Hadoop, Hive, PySpark)
- Strong proficiency with database systems (PostgreSQL)
- Strong proficiency with workflow orchestration (Argo Workflows, Airflow)
- Strong proficiency with messaging systems (Kafka)
Responsibilities
- Design, develop, and maintain data enrichment pipelines using Argo Workflows on Kubernetes
- Implement PySpark- and Hadoop-based processing for large-scale data operations
- Build and manage data ingestion capabilities from multiple third-party sources into the GEN-5 platform
- Implement and optimize data workflows across PostgreSQL, Hive, and Kafka ecosystems
- Architect and deploy AWS infrastructure to support data processing at scale, including the transition to AWS GovCloud
- Develop and maintain Python applications that support Boolean query refinement, data discovery, and analysis dashboards
- Create reusable pipeline templates that enable consistent processing of similar data types
Other
- Must be able to work on U.S. Government contracts restricted to U.S. citizens
- Must be eligible to obtain a U.S. Government security clearance
- BS in Computer Science, Computer Engineering, or a related field
- Ability to work independently while taking initiative
- Experience identifying opportunities for automation and optimization
- Strong collaboration skills with cross-functional teams