At Storable, the goal is to power the future of storage. This role leverages cutting-edge technologies to improve the efficiency and accessibility of data and the insights derived from it, empowering the team to make smarter decisions and foster impactful growth.
Requirements
- Proven Expertise in Data Management: 6+ years of significant experience managing data infrastructure and data governance, and optimizing data pipelines at scale.
- Technical Proficiency: 5+ years of strong hands-on experience with data tools and platforms such as Apache Airflow, Apache Iceberg, and AWS services (S3, Lambda, Redshift, Glue, Athena).
- Data Pipeline Mastery: Familiarity with designing, implementing, and optimizing data pipelines and workflows in Python or other languages for data processing.
- 5+ years of hands-on experience with Trino/Presto and Apache Spark for distributed data processing.
- Solid understanding of data modeling, warehousing concepts, and schema design.
- Experience with Data Governance: Solid understanding of data privacy, quality control, and governance best practices.
- Bonus Points: Experience with visualization tools (e.g., Looker, Tableau) and reporting frameworks to provide actionable insights.
Responsibilities
- Oversee Data Pipelines: Design, implement, and maintain scalable data pipelines using industry-standard tools to efficiently process and manage large-scale datasets.
- Ensure Data Quality & Governance: Implement data governance policies and frameworks to ensure data accuracy, consistency, and compliance across the organization.
- ETL Development: Build, optimize, and maintain ETL pipelines for ingesting, transforming, and delivering large datasets from multiple sources.
- Workflow Orchestration: Manage and schedule complex workflows using Apache Airflow.
- Query Engines & Processing Frameworks: Leverage Trino (Presto), Apache Spark, and other query engines and processing frameworks for distributed data processing.
- Optimize Data Infrastructure: Leverage modern data tools and platforms (e.g., AWS, Apache Airflow, Apache Iceberg) to create an efficient, reliable, and scalable data infrastructure.
- Monitor & Improve Performance: Proactively monitor data processes and workflows, troubleshoot issues, and optimize performance to ensure high reliability and data integrity.
Other
- Leadership Skills: Ability to lead and mentor teams, influence stakeholders, and drive data initiatives across the organization.
- Analytical Mindset: Strong problem-solving abilities and a data-driven approach to improve business operations.
- Excellent Communication: Ability to communicate complex data concepts to both technical and non-technical stakeholders effectively.
- All applicants must be currently authorized to work in the United States on a full-time basis.
- Must reside in one of the following states: AL, AZ, CA, CO, CT, FL, GA, ID, IL, IN, IA, KS, LA, MD, MA, MI, MN, MO, MS, NC, NE, NJ, NV, NY, OH, OK, OR, PA, SC, TN, TX, UT, VA, WA, WI, WY.