Spectrum is seeking data engineering expertise to build reliable data systems and automate the pipelines that supply actionable insights for strategic decision-making.
Requirements
- 5+ years of hands-on experience with RDBMS, SQL, scripting, and coding
- 3+ years of Linux/Unix/CentOS system administration experience
- Ability to work with a wide variety of open-source technologies and cloud services, and to identify and resolve end-to-end performance, network, server, and platform issues
- Extensive coding/scripting experience in Python, R, and shell
- Extensive experience with Spark, Hadoop/Hive, SQL, Tableau, and ML pipeline and ETL techniques
- Extensive background in Linux/Unix/CentOS installation and administration; Windows experience preferred
- Extensive knowledge of data storage, including when to use a file system, a relational database, or a NoSQL variant
Responsibilities
- Design and maintain scalable systems that support data operations for reporting, analytics, applications, and data science
- Gather and process raw data at scale using scripts, web scraping, APIs, and SQL queries, and apply ETL methods to clean and enhance datasets
- Assess data quality, integrity, accuracy, and completeness through profiling techniques
- Develop and implement tools, scripts, queries, and applications for ETL/ELT and data operations
- Design, build, and automate machine learning data pipelines
- Code, develop, and test scripts to deliver solutions on time, and report on progress
- Manage the life cycle of multiple data sources, and collaborate with analysts and data scientists to meet their data needs
Other
- This role requires the ability to work lawfully in the U.S. without employment-based immigration sponsorship, now or in the future.
- Office environment
- Travel as required
- Ability to read, write, speak, and understand English
- Strong attention to detail and the ability to prioritize and execute multiple tasks effectively