USDS is looking for a Site Reliability Engineer to focus on data pipeline reliability for the Video Platform team, ensuring users have the freshest, most complete, and most accurate data possible.
Requirements
- Strong programming experience with SQL and at least one of the following languages: Java, Python, Go, or Scala
- Experience in data engineering, with a focus on data systems reliability, scalability, performance and capacity management
- Solid experience with big data technologies (e.g., Hadoop, Spark, Flink, YARN) and databases (SQL, NoSQL)
- Knowledge of data pipeline and workflow management tools (e.g., Airflow, Luigi)
- Experience building data solutions on AWS, Google Cloud, Azure, or other cloud services is a plus
- Demonstrated independent thinking and troubleshooting skills in large-scale distributed systems
Responsibilities
- Manage day-to-day operations of data services and real-time/batch data pipelines, including Service Level Agreement (SLA) management, pipeline deployment, performance tuning, and troubleshooting
- Proactively monitor and troubleshoot data pipelines and systems for performance issues, errors, or anomalies
- Create tools, build alarms and dashboards, and drive internal process improvements and automation to monitor and improve data engineering operations
- Improve system reliability, efficiency, and velocity by scaling and optimizing both resources and data processing workflows, refactoring code or implementing new solutions where needed
- Develop and deploy new reliable and scalable data pipelines and infrastructure components as required by business needs
- Work closely with data engineering and various vertical teams within the Video Architecture platform
Other
- This is a Site Reliability Engineer role focusing on data pipeline reliability for the Video Platform team in USDS.
- To enhance collaboration and cross-functional partnerships, among other things, our organization currently follows a hybrid work schedule that requires employees to work in the office 3 days a week, or as directed by their manager/department.
- Good communication and coordination skills
- This role requires the ability to work with and support systems designed to protect sensitive data and information.
- As such, this role will be subject to strict national security-related screening.