Research Engineers & Research Scientists

Full Time | San Francisco, CA | PerfectBit

About the Team

  • Peter Vajda — Former Director of Media Generation at Meta Superintelligence Labs, where he led the foundation models behind Movie Gen and Emu. Previously served as a Visiting Assistant Professor at Stanford. Holds a PhD in Computer Science.

  • Seiji Yamamoto — Former Senior Staff Research Scientist at Meta Superintelligence Labs, where he led teams in the Core Llama group across LLM pre-training, post-training, inference, speech, and vision. Holds a PhD in Physics.

About the Company

PerfectBit is a new kind of data company built to solve the biggest bottleneck in physical AI: data.

The real-world interaction data that robots and world models need doesn't exist on the internet—it has to be created. Most data companies address this challenge by scaling networks of human experts. We take a different approach.

We use applied AI to develop novel solutions to the data problems slowing down robotics, world models, and physical AI. Our quality and scale come from better methods—not simply more headcount.

Who We're Seeking

We're looking for candidates who:

  • Demonstrate high agency and autonomy—you identify opportunities, define the scope, and deliver results.

  • Have built and shipped real-world systems across data, machine learning, or infrastructure, rather than only prototypes or research projects.

  • Care deeply about outcomes and the mission, not performance reviews.

  • Thrive in a small, highly collaborative team environment.

  • Effectively leverage AI coding tools such as Claude and Codex to accelerate development and execution.

Preferred Qualifications

Strong candidates may also have:

  • A track record of published research at leading conferences such as NeurIPS, ICLR, CVPR, ICCV, ICRA, or CoRL.

  • Deep expertise in one or more of the following areas:

    • Reinforcement learning

    • Robotics

    • Multimodal AI, including LLMs, VLMs, VLAs, world models, WAMs, diffusion models, and flow matching

    • Synthetic data generation

    • Simulation platforms such as Isaac, MuJoCo/MJX, or Genesis

    • 3D graphics and reconstruction, including Blender, CAD, Gaussian Splatting, and NeRF

    • Teleoperation and data collection hardware

    • Evaluation frameworks (evals)

    • AI infrastructure

    • Model pre-training and post-training

    • AI agents