
Vision Research Intern-1

Centific

$30 - $50
Sep 23, 2025
Seattle, WA, USA

Centific translates cutting-edge research in computer vision, multimodal large models, and embodied/physical AI into production systems that can perceive, reason, and act in the real world, enabling enterprise clients to deploy AI safely and at scale.

Requirements

  • Strong PyTorch (or JAX) and Python; comfort with CUDA profiling and mixed‑precision training.
  • Demonstrated research in computer vision and at least one of: VLMs (e.g., LLaVA‑style, video‑language models), embodied/physical AI, 3D perception.
  • Proven ability to move from paper → code → ablation → result with rigorous experiment tracking.
  • Experience with video models (e.g., TimeSFormer/MViT/VideoMAE), diffusion or 3D GS/NeRF pipelines, or SLAM/scene reconstruction.
  • Prior work on multimodal grounding (referring expressions, spatial language, affordances) or temporal reasoning.
  • Familiarity with ROS2, DeepStream/TAO, or edge inference optimizations (TensorRT, ONNX).
  • Scalable training: Ray, distributed data loaders, sharded checkpoints.
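
The mixed-precision training comfort asked for above can be sketched with PyTorch's `autocast`/`GradScaler` APIs. This is a minimal illustration, not Centific's training code; the linear model and random batch are placeholders for a real vision backbone and data loader:

```python
import torch
from torch import nn
from torch.cuda.amp import autocast, GradScaler

# Placeholder model and optimizer; a real project would use a vision backbone.
model = nn.Linear(16, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# GradScaler rescales the loss so float16 gradients don't underflow.
# Both autocast and the scaler become no-ops when CUDA is unavailable.
use_amp = torch.cuda.is_available()
scaler = GradScaler(enabled=use_amp)

def train_step(x, y):
    optimizer.zero_grad(set_to_none=True)
    # autocast runs the forward pass in reduced precision where it is safe.
    with autocast(enabled=use_amp):
        loss = nn.functional.cross_entropy(model(x), y)
    scaler.scale(loss).backward()
    scaler.step(optimizer)  # unscales gradients, then steps the optimizer
    scaler.update()
    return loss.item()

loss = train_step(torch.randn(8, 16), torch.randint(0, 2, (8,)))
```

On a CUDA machine the same loop runs in float16 with loss scaling; on CPU it degrades gracefully to full precision, which keeps the sketch runnable anywhere.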

Responsibilities

  • Build and fine‑tune models for detection, tracking, segmentation (2D/3D), pose & activity recognition, and scene understanding (incl. 360° and multi‑view).
  • Train/evaluate vision–language models (VLMs) for grounding, dense captioning, temporal QA, and tool‑use; design retrieval‑augmented and agentic loops for perception‑action tasks.
  • Prototype perception‑in‑the‑loop policies that close the gap from pixels to actions (simulation + real data). Integrate with planners and task graphs for manipulation, navigation, or safety workflows.
  • Curate datasets, author high‑signal evaluation protocols/KPIs, and run ablations that make irreproducible results impossible.
  • Package research into reliable services on a modern stack (Kubernetes, Docker, Ray, FastAPI), with profiling, telemetry, and CI for reproducible science.
  • Orchestrate multi‑agent pipelines (e.g., LangGraph‑style graphs) that combine perception, reasoning, simulation, and code‑generation to self‑check and self‑correct.

Preferred Qualifications

  • Ph.D. student in CS/EE/Robotics (or related), actively publishing in CV/ML/Robotics (e.g., CVPR/ICCV/ECCV, NeurIPS/ICML/ICLR, CoRL/RSS).
  • Public code artifacts (GitHub) and first‑author publications or strong open‑source impact.

Expected Outcomes

  • A publishable or open‑sourced outcome (with company approval) or a production‑ready module that measurably moves a product KPI (latency, accuracy, robustness).
  • Clean, reproducible code with documented ablations and an evaluation report that a teammate can rerun end‑to‑end.
  • A demo that clearly communicates capabilities, limits, and next steps.