Data Scientist (AI Data & LLM Specialist)

Eclipse Labs

Salary not specified
Nov 7, 2025
Remote, US

Eclipse is building an AI agent-first marketplace that connects intelligence with real-world tasks, starting with data collection and labeling. The company is seeking a Data Scientist to establish the foundation for how their data is labeled, processed, and prepared for consumption by next-generation Large Language Models (LLMs), transforming raw data collections into valuable, AI-ready datasets.

Requirements

  • Strong understanding of data labeling methodologies and hands-on experience with data annotation platforms and workflows.
  • Demonstrated experience preparing datasets for training and fine-tuning Large Language Models (LLMs), including knowledge of techniques like tokenization, embeddings, and named entity recognition (NER).
  • Proficiency in Python and common data science libraries (e.g., Pandas, NumPy, Scikit-learn, spaCy, Hugging Face).
  • Experience using APIs/SDKs to automate data annotation and active learning loops.
  • Experience with audio data processing and relevant libraries.
  • Familiarity with data annotation platforms and tools.
  • Knowledge of modern MLOps principles and practices.

Responsibilities

  • Develop Data Labeling Strategies: Design and document a formal data annotation strategy, including clear, scalable, and efficient guidelines for labeling our data.
  • Define and enforce quality metrics, including inter-annotator agreement.
  • Optimize for LLM Consumption: Research, define, and prototype the optimal data formats, structures, and pre-processing steps required for fine-tuning and training LLMs on our datasets.
  • Data Quality Analysis: Establish automated processes and metrics to analyze the quality of both raw and labeled data, providing feedback to improve our data collection and labeling workflows.
  • Collaborate with Engineering: Work closely with the engineering team to guide the implementation of data processing pipelines and ensure the data infrastructure meets the needs of ML applications.
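
As an illustration of the inter-annotator agreement metric mentioned above, here is a minimal sketch of Cohen's kappa for two annotators, written in plain Python. The posting does not say which agreement metric or tooling the team uses; the function name `cohens_kappa` and the sample labels are hypothetical and chosen only for this example.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items.

    Hypothetical helper for illustration; not taken from the posting.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty item set.")

    n = len(labels_a)

    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each class, the product of the two annotators'
    # marginal label frequencies, summed over classes.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    # Both annotators used one and the same label everywhere.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Made-up labels for four items from two annotators.
annotator_1 = ["speech", "speech", "noise", "noise"]
annotator_2 = ["speech", "noise", "noise", "noise"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

Kappa corrects raw agreement for agreement expected by chance, so a value near 0 means the annotators agree no more often than random labeling would, and a value of 1 means perfect agreement. For more than two annotators, a different statistic such as Fleiss' kappa or Krippendorff's alpha would be needed.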

Other

  • Excellent communication skills, with an ability to create clear documentation for technical and non-technical audiences.
  • Flexibility. We collaborate synchronously and asynchronously through weekly all-hands meetings, Slack messaging, and quarterly in-person meetups.
  • Culture. As an early member of our team, you’ll have a unique opportunity to help shape our culture. We value intellectual honesty and a bias towards action, and we believe every member plays a key role in achieving our ambitious goals.
  • Compensation. You’ll receive a competitive salary + equity + benefits package.
  • Eclipse Laboratories is an equal opportunity employer.