Engineering Internship, Enrichment and Curation

About the Role

Our team is seeking a talented Engineering Intern to join us for 3-6 months and propel our ambitious research in embodied foundation models forward.

We’re a team of Applied Scientists, Machine Learning Engineers, and Software Engineers who strive to expand the horizons of embodied AI beyond simply reacting to perceptual inputs toward reasoning over them to handle even the most complex and rare situations. Our projects encompass some of the hardest problems in AI and require leveraging the latest research, state-of-the-art models, rigorous engineering, and cross-functional collaboration.

In this role, you might:

Work on foundation models for embodied AI, including large-scale pretraining, post-training, leveraging language, or improving reasoning capabilities.
Train models on large-scale multimodal (vision, language, etc.) data efficiently in a multi-node distributed system, and evaluate their performance on open (and closed) datasets/benchmarks.
Curate large multimodal datasets for training and evaluation.
Lead high-impact research and publish at a top-tier conference (e.g., CVPR, ICCV, CoRL, NeurIPS, CoLM, RSS, ICRA).

You’d be a great match for this role if:

You have previous experience with vision-language models, large language models, or natural language processing, especially around reasoning.
You have prior experience in curating training data to steer the behavior of trained models.
You have solid software engineering fundamentals, especially in Python.
You have previously used PyTorch or a similar library for deep learning (e.g., TensorFlow, JAX).
You have experience with multi-node distributed training of large models.
You are interested in using large-scale multimodal (vision, language, etc.) datasets to improve embodied AI.
You have previous publications at top-tier conferences (e.g., CVPR, ICCV, CoRL, NeurIPS, CoLM, RSS, ICRA).

Essentials:

You are currently pursuing a graduate degree in Computer Science, Machine Learning, Robotics, or a related technical field.
You are proficient in at least one backend/systems programming language (e.g., Python, Ruby, Java).

We understand that everyone has a unique set of skills and experiences and that not everyone will meet all of the requirements listed above. If you’re passionate about self-driving cars and think you have what it takes to make a positive impact on the world, we encourage you to apply.

This is a full-time role based in Sunnyvale, CA (hybrid). The reasonably estimated hourly rate for this role is $99.76/hour.