
Research Engineer

San Francisco, CA · In-person · Full-time

Most vision models are either accurate or fast. We need ours to be both, because people are running them on every frame of video in real time. That constraint makes the work genuinely hard in a way that "scale up the cluster" doesn't solve.

We don't separate research from engineering. You design the architecture, you train it, you write the kernel that makes it actually run, you look at the failures and figure out why they happened. The person closest to the problem makes the decision. If that sounds exhausting, this role probably isn't for you. If it sounds like freedom, keep reading.

You should probably apply if:

  • You've trained models from scratch and have opinions about why your architecture choices worked (or didn't)
  • You can read a paper at breakfast and have a working implementation by dinner
  • You reach for CUDA or Triton when PyTorch is leaving performance on the table, and it doesn't feel like a big deal
  • You'd rather ship a model that thousands of developers deploy than publish a paper that dozens of researchers cite
  • You have good taste about what experiments to run next, not just the ability to run them

You should definitely not apply if:

  • You want to write papers and hand off implementation to an engineering team
  • You need someone to tell you what experiment to run next
  • Remote work is non-negotiable (we're in-person in San Francisco for a reason)

Details

  • Location: San Francisco, CA (in-person)
  • Stack: Python, PyTorch, CUDA, and whatever else the problem demands
  • Compensation: $250k–$350k + meaningful equity
  • Benefits: Health, dental, vision, paid parental leave, relocation support

Apply

Send your resume to hiring@moondream.ai