OCC's data platform needs a semantic layer to simplify complex data, eliminate redundant definitions, create query-friendly datasets, and standardize column naming for downstream users developing quantitative analytics, dashboards, and internal risk applications. This role will also address strategic data challenges and evolve analytics platforms to meet changing needs while adhering to security and IT standards.
Requirements
- Ability to write and optimize complex analytical (SELECT) SQL queries
- Ability to write and optimize Python for custom data pipeline code (virtual environments, scripts vs. modules vs. packages, functional programming, unit testing)
- Experience with a source code version control system (preferably Git), including branch management and pull requests
- Experience with data visualization and preparation tools (preferably Tableau and Alteryx)
- Experience with transformation/semantic layer frameworks, such as dbt
- Familiarity with services on at least one cloud computing platform, such as AWS or Azure, or a cloud data platform such as Databricks or Snowflake
- Familiarity with data modeling concepts such as third normal form (3NF) and denormalized designs such as the star schema
Responsibilities
- Support the design and implementation of cloud infrastructure for the internal analytics zone in collaboration with OCC’s Data Platform team, data architects, DevOps, and IT
- Assist in building, testing, and deploying the semantic layer’s virtual and physical data models, which simplify complex semi-structured data, eliminate multiple definitions of similar data, create query-friendly datasets, and standardize column naming for downstream users who are developing quantitative analytics, dashboards, and internal risk applications
- Assist in maintaining performance and accuracy SLAs for the semantic layer and other data products through observability practices that support proactive detection of system failures and incident response
- Work with upstream data producers to understand how their systems work, how they generate data, and how that data is subject to change over time, to help manage schema drift
- Collaborate with Data Governance, the Data Platform team, and DBAs to design access controls for the data platform that meet business and internal governance needs
- Create documentation and tests to ensure data lineage is traceable and semantic layer components are easily discoverable and useful to business users
- Support the implementation of ETL and data serving solutions that meet critical business user SLAs around latency and access patterns for large datasets generated by our risk models
Other
- Ability to collaborate with multiple partners (e.g., business users, data and solution architects, Data Governance, and IT teams such as the Data Platform team, Systems & Infrastructure, Security, DevOps, and Networking) to craft solutions that align business goals with internal processes, security, and delivery standards
- Ability to communicate technical concepts to audiences with varying levels of technical background and synthesize non-technical requests into technical output
- Comfortable supporting business analysts on high-priority projects
- High attention to detail and tradeoffs, and an ability to think structurally about a solution
- 3+ years of experience as a data engineer, software engineer, data scientist, financial risk analyst, or business intelligence analyst