Backend Engineer
San Francisco, CA · In-person · Full-time
We make vision models fast enough to process every frame of video in real time. The models are the easy part to explain. The hard part is everything else: serving inference at scale, keeping tail latency sane, building APIs that developers actually want to integrate, and not letting the infrastructure rot as we ship faster.
This is a small team. You won't be writing microservices in a vacuum. On Monday you might be profiling a Rust inference server. On Wednesday you're debugging a weird Cloudflare Workers edge case. On Friday you're writing the Python pipeline that processes the training data for next week's model. If that sounds chaotic, it is. If it sounds fun, keep reading.
You should probably apply if:
- You've been on-call for something that actually mattered and you have scar tissue from it
- You pick up new languages fast because you understand what's happening underneath, not because you memorized the syntax
- You've looked at a distributed system and thought “this is too complicated” and then made it simpler
- You have opinions about when to use Postgres vs. DynamoDB vs. just putting it in S3 and calling it a day
- You've shipped things you're proud of, and you can point at them
You should definitely not apply if:
- You want to write Java in one corner of the stack for the next three years
- You think “that's not my service” when something breaks at 2am
- Remote work is non-negotiable (we're in-person in San Francisco for a reason)
What you'll actually do
- Build and run the inference infrastructure that serves Moondream's vision models to thousands of developers
- Own things end-to-end. There is no platform team. There is no SRE team. There is you.
- Design APIs that are boring in the best way: predictable, fast, well-documented
- Figure out how to do more with less. We're not going to outspend the big companies on GPUs, so we have to be smarter about how we use them.
- Probably mass-delete Terraform at some point
Details
- Location: San Francisco, CA (in-person)
- Stack: Python, Node.js, Rust, C++, AWS, Cloudflare, and whatever else the problem calls for
- Compensation: $200k–$270k + meaningful equity
- Benefits: Health, dental, vision, paid parental leave, relocation support
Apply
Send your resume to hiring@moondream.ai