
Research Engineer

San Francisco, CA · In-person · Full-time

Most vision models are either accurate or fast. We need ours to be both, because people are running them on every frame of video in real time. That constraint makes the work genuinely hard in a way that "scale up the cluster" doesn't solve.

We don't separate research from engineering. You design the architecture, you train it, you write the kernel that makes it actually run, you look at the failures and figure out why they happened. The person closest to the problem makes the decision. If that sounds exhausting, this role probably isn't for you. If it sounds like freedom, keep reading.

You should probably apply if:

  • You've trained models from scratch and have opinions about why your architecture choices worked (or didn't)
  • You can read a paper at breakfast and have a working implementation by dinner
  • You reach for CUDA or Triton when PyTorch is leaving performance on the table, and it doesn't feel like a big deal
  • You'd rather ship a model that thousands of developers deploy than publish a paper that dozens of researchers cite
  • You have good taste about what experiments to run next, not just the ability to run them

You should definitely not apply if:

  • You want to write papers and hand off implementation to an engineering team
  • You need someone to tell you what experiment to run next
  • Remote work is non-negotiable (we're in-person in San Francisco for a reason)

Details

  • Location: San Francisco, CA (in-person)
  • Stack: Python, PyTorch, CUDA, and whatever else the problem demands
  • Compensation: $250k–$350k + meaningful equity
  • Benefits: Health, dental, vision, paid parental leave, relocation support

Apply

Send your resume to hiring@moondream.ai