The company is looking to develop novel approaches for data generation, curation, and evaluation to support emerging use cases across domains.
Requirements
Strong foundation in large language models, generative AI, or data generation techniques, especially for supervised fine-tuning and reinforcement learning
Experience developing, experimenting with, and deploying AI models and data pipelines at scale
Solid programming skills in Python
Familiarity with ML frameworks such as PyTorch and Hugging Face
Familiarity with software engineering best practices and clean coding
Past experience in data labeling, annotation, or curation projects
Knowledge of production workflows for Data as a Service (DaaS) offerings or data delivery teams
Responsibilities
Conduct research on data curation and generation to support emerging use cases across domains
Design and prototype data generation and curation pipelines that feed directly into Data as a Service offerings
Build sophisticated evaluators that measure the quality of our data, including coverage, bias, and utility
Write clear, maintainable Python code to support experiments and production pipelines
Iterate rapidly on solutions based on customer feedback, emerging research, and evolving DaaS requirements
Collaborate cross-functionally with delivery managers, vendors, and engineering teams to move research into production
Translate customers' high-level goals into data requirements, annotation guidelines, and workflows
Other
PhD in Computer Science or a related field, with a focus on data-centric AI and synthetic data generation
Track record of working in fast-paced, iterative environments and handling uncertainty in project requirements
Bias for action: comfortable rolling up your sleeves, experimenting, and iterating quickly to solve problems
Strong communication and collaboration skills, especially when working across research, engineering, and delivery teams
Competitive compensation range of $140,000 – $275,000 plus equity opportunities
Growth-oriented environment where your work directly impacts product direction and customer success