Moondream 2025-03-27 Release
March 28, 2025
We're excited to announce a new Moondream release. There's a lot to unpack, but we'll give you the highlights here and share more over the coming weeks. The improvements in this release were driven by real-world usage and feedback from our community and customers. We want to extend a huge thank you to everyone who contributed. Keep it coming, let's gooooo!
Longer
The ability to caption images is one of the top Moondream use cases. Super accurate captions from a fast, efficient, and easy-to-run model seem to be a winning combo! Use cases range from synthetic data generation to real-world understanding and robotics.
Until now, Moondream offered a choice of "Short" or "Normal" length captions. This model introduces a "Long" format. In our testing, it generates captions roughly 2x longer than "Normal".
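To compare the three lengths side by side, here is a minimal sketch. It assumes a model object that exposes caption(image, length=...) and returns a dict with a "caption" key, as the transformers-based client appears to do. The lowercase length strings (including "long") and the helper names caption_all_lengths and word_counts are assumptions for illustration, not confirmed API.

```python
from typing import Any, Dict

# Caption lengths named in this post; the exact string values are assumed.
CAPTION_LENGTHS = ("short", "normal", "long")


def caption_all_lengths(model: Any, image: Any) -> Dict[str, str]:
    """Caption one image at every length and return {length: caption}.

    Assumes `model.caption(image, length=...)` returns a dict with a
    "caption" key. Treat that signature as an assumption.
    """
    captions = {}
    for length in CAPTION_LENGTHS:
        result = model.caption(image, length=length)
        captions[length] = result["caption"]
    return captions


def word_counts(captions: Dict[str, str]) -> Dict[str, int]:
    """Count whitespace-separated words per caption, a crude length proxy."""
    return {length: len(text.split()) for length, text in captions.items()}
```

Running word_counts over the result is a quick way to check the "roughly 2x" ratio on your own images.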

Better
We're still doing exhaustive evals, but so far we've seen major improvements on object detection (COCO mAP), OCR (OCRBench), and counting (CountBenchQA):

We had some customers request improvements in our object detection capability. We were excited to work on that, and we're especially pleased with the results. This new COCO mAP score now makes Moondream near state of the art on object detection.
We also had a customer with a specific need: the ability to tag all the things visible in an image. While we don't have a public benchmark to highlight, our internal benchmark and vibe checks show a huge improvement in this ability. We call it "Image Tagging", and you can try it out by using this prompt in your image query: "List all visible objects, features, and characteristics of this image. Return the result as a JSON array." Here's an example of how it works.
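If you want to use the tags programmatically, the answer text has to be parsed. The sketch below assumes the client exposes query(image, prompt) and returns a dict with an "answer" key. The helpers parse_tags and tag_image are hypothetical names, and the bracket search is a defensive guess in case the model wraps the array in extra text.

```python
import json
from typing import Any, List

# The tagging prompt quoted in this post.
TAGGING_PROMPT = (
    "List all visible objects, features, and characteristics of this image. "
    "Return the result as a JSON array."
)


def parse_tags(answer: str) -> List[str]:
    """Extract a JSON array of tags from the model's text answer."""
    # Take the outermost [...] span so stray text around it is ignored.
    start = answer.find("[")
    end = answer.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no JSON array found in answer")
    tags = json.loads(answer[start : end + 1])
    return [str(tag).strip() for tag in tags]


def tag_image(model: Any, image: Any) -> List[str]:
    """Run the tagging prompt and return the parsed tags.

    Assumes `model.query(image, prompt)` returns {"answer": "<text>"}.
    """
    answer = model.query(image, TAGGING_PROMPT)["answer"]
    return parse_tags(answer)
```

parse_tags raises ValueError when no array is present and lets json.JSONDecodeError propagate on malformed output, so callers can decide whether to retry.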

Faster
We have a client update planned for the next few weeks that includes mobile, but in the meantime, we snuck one key improvement into our transformers-based client: it just got a lot faster. In our testing, calling "compile()" on the model sped up inference from 61.4 tok/s to 123.4 tok/s on an Nvidia 3090, roughly a 2x speedup. That not only makes it cheaper to run, it also opens up more possibilities for near-realtime processing, especially for video streaming.
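To check the speedup on your own hardware, a small timing harness is enough. The sketch below is model-agnostic: tokens_per_second and speedup are hypothetical helper names, and generate stands in for whatever call you time (for example, a caption request made before and after calling compile() on the model), returning the number of tokens it produced.

```python
import time
from typing import Callable


def tokens_per_second(generate: Callable[[], int], runs: int = 3) -> float:
    """Average throughput of `generate`, which returns tokens produced."""
    total_tokens = 0
    start = time.perf_counter()
    for _ in range(runs):
        total_tokens += generate()
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed


def speedup(before_tok_s: float, after_tok_s: float) -> float:
    """Ratio of throughput after an optimization to throughput before."""
    return after_tok_s / before_tok_s


# Figures reported in this post for an Nvidia 3090.
print(f"{speedup(61.4, 123.4):.2f}x")  # prints "2.01x"
```

Compilation usually adds a one-time warm-up cost, so it is worth discarding the first call before measuring.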
Stronger
Moondream's improvements are driven by the feedback and engagement of its growing community and customers. This creates a flywheel effect: feedback and requests from the community drive us to make more improvements to the model, and those improvements drive more adoption from an ever-growing community. In other words, you're part of the reason Moondream keeps growing. We tip our hat to you, our Moondreamers.
Conclusion
We'll be sharing more details over the next few weeks, but the great news is that you don't have to wait. You can download the model now, go kick its tires in our playground, or, better yet, build something with our free cloud offering.