The Senior Data Scientist II analyzes complex structured and unstructured data using state-of-the-art data science methods to support data-driven decision making. The role develops algorithms that enable machines to perform tasks that typically require human intelligence, and applies data science and artificial intelligence methods to solve real-world problems.
Requirements
- Proficient in machine learning, data mining, statistical analysis, and applied AI methods using Databricks, Azure ML, and Dataiku.
- Advanced proficiency in Python, PySpark, SQL, and tools such as Jupyter, VS Code, and MLflow.
- Experience with database technologies and architectures including Delta Lake, Azure Data Lake, SQL Warehouses, and Synapse Analytics.
- Hands-on experience with AutoML platforms such as Dataiku, and familiarity with Azure AutoML.
- Deep understanding of Azure Cloud resources, including Azure Machine Learning, Azure DevOps, Azure Cognitive Search, and Azure OpenAI.
- Familiarity with Generative AI solutions, Large Language Models, and NLP frameworks like Hugging Face Transformers.
- 5 or more years of experience in Data Science, Information Systems, Computer Science, Engineering, or another related field.
Responsibilities
- Apply advanced data science concepts to deliver data-driven digital offerings and insights using Databricks Lakehouse architecture.
- Utilize modern machine learning methods and domain understanding to support the creation of new products and services, leveraging MLflow for experiment tracking and model lifecycle management.
- Write source code independently in Python, PySpark, and SQL; validate and test models; and use Databricks Feature Store for consistent feature reuse and governance.
- Design and implement robust data architectures using Delta Lake and manage data assets securely via Unity Catalog and Azure Data Lake.
- Combine Agile methodologies with data science practices to build advanced analytics and AI products using Databricks Workflows and Azure ML Pipelines.
- Develop, test, deploy, and maintain machine learning and AI models using Databricks Runtime for ML, ensuring scalability, performance, and governance.
- Lead the data-driven decision-making process, from data collection and analysis to implementation and monitoring of solutions using Databricks Jobs, CI/CD pipelines, and Azure DevOps.
Other
- Mentor junior team members, lead development of data products, communicate complex solutions effectively, and guide decision-making within the organization.
- Collaborate with data and analytics teams and cross-functional departments such as digital, services, class, and engineering to build scalable ML solutions and deliver actionable insights.
- Mentor data scientists, ASPIRES, and interns, providing guidance and support in their professional development and technical growth.
- Evaluate and partner with external customers, vendors, university relations, and other teams to drive innovation and collaboration.
- Stay current in the field of AI and advanced analytics, with a focus on innovations within the Databricks, Azure, and OpenAI ecosystems, including LLMs, GenAI, and MLOps.