Research Scientist 5 - Vision-Language-Action Models - Autonomous Systems

Robert Bosch Venture Capital

Salary not specified

Sep 24, 2025

Sunnyvale, CA, USA

Conduct research and engineering in core AI and machine learning fields to enable Embodied AI for Bosch's AIoT (AI+IoT) business domains, including autonomous driving, industrial automation, and robotics.

Requirements

Proficiency in one or more programming languages commonly used in machine learning (e.g., Python, C++, Rust).
Knowledge of major machine learning frameworks like TensorFlow or PyTorch.
Hands-on experience building and applying multimodal transformer-based sequence-to-sequence models.
Familiarity with concepts in vision-language-action models like MoE, GRPO, LoRA, etc.
Hands-on experience in computer vision and deep learning, with work in any of the following areas: multimodal transformers, multimodal language models, diffusion models, NeRF, Gaussian splatting, object detection/segmentation, 3D scene understanding, sensor calibration, SfM, voxel/BEV grid-based feature representation.

Responsibilities

Conduct research and engineering in core AI and machine learning fields to enable Embodied AI (including computer vision, autonomous planning, and open-world learning) for related AIoT (AI+IoT) business domains such as autonomous driving, industrial automation, and robotics.
Push the boundaries in (modular) end-to-end perception and planning for automated driving, incorporating advancements in large vision-language-(action) models to aid reasoning capabilities and explainability.
Implement research results to solve real-world challenges, ensuring high-quality system integration within Bosch's existing platforms.
Document and disseminate research findings through high-caliber publications and/or patent submissions.

Other

Collaborate with a global team to transfer cutting-edge research findings to Bosch's operational units.
Stay abreast of the latest technological advancements and market trends by attending academic conferences, technical events, and seminars.
Ph.D. in Computer Science, Robotics, or a related discipline, or Master’s degree with >= 1 / 3 years of industry experience after graduation.
A minimum of 3 years of R&D experience, or an equivalent graduate research background, primarily in AI technologies including Computer Vision and Robotic or Automotive Motion and Behavioral Planning.
Strong interpersonal, communication, and teamwork capabilities.
Experience with real-world product development and deployment of autonomous systems.
A strong portfolio of publications in premier machine learning, deep learning, robotics and computer vision journals and conferences.