Better answers to questions about your images. Classify by your categories, read your forms, recognize your products.
Captions in your style and voice. Describe what matters for your use case, skip what doesn't.
Find the objects you care about, ignore the rest. Cut false positives down to near zero.
More accurate clicks. Better grounding for agents and UI automation.
Pick the method that fits the data you have.
Show, don't tell.
Give Moondream input/output pairs and it learns to match them. Best for teaching domain-specific concepts or when you already have a dataset.
- Classification with a small set of categories
- Captioning in a fixed style or voice
- Detection with bounding boxes
- Structured outputs and form parsing
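What an input/output pair looks like depends on the task. Here is a minimal sketch in Python, assuming a simple JSON Lines layout. The field names (`image`, `input`, `output`), the file paths, and the helpers `make_pair` and `to_jsonl` are hypothetical, chosen for illustration. They are not the schema Lens expects.

```python
import json

# Hypothetical record layout: field names are illustrative only,
# not the actual schema used by Lens or Moondream.
def make_pair(image_path, prompt, target):
    """Bundle one training example: an image, the question, the desired answer."""
    return {"image": image_path, "input": prompt, "output": target}

# Three example pairs for a small classification task (paths are made up).
pairs = [
    make_pair("images/hand_001.jpg", "Which gesture is shown?", "rock"),
    make_pair("images/hand_002.jpg", "Which gesture is shown?", "paper"),
    make_pair("images/hand_003.jpg", "Which gesture is shown?", "scissors"),
]

def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(record, sort_keys=True) for record in records)

dataset_text = to_jsonl(pairs)
```

The same shape covers the other tasks in the list: only the `output` changes, for example a caption, a bounding box, or a structured record.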
Reward what works.
Give Moondream a task and score several of its answers. It learns to favor the ones that score higher. Best when the model is already somewhat proficient, or when you have little data. Works with as few as 20 examples.
- Reasoning and multi-step tasks
- Open-ended outputs with many valid answers
- Cases where you can verify correctness automatically
- Optimizing directly for a metric
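When correctness can be checked automatically, the score can be an ordinary function. Here is a minimal sketch in Python, assuming a detection task where each candidate box is scored by intersection over union (IoU) against a labeled box. The names `iou` and `rank_candidates` and the box format `(x_min, y_min, x_max, y_max)` are assumptions for illustration. They are not part of Lens or Moondream.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Width and height of the overlap; zero when the boxes do not intersect.
    overlap_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    overlap_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = overlap_w * overlap_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

def rank_candidates(candidates, truth):
    """Order candidate boxes from highest to lowest score against the labeled box."""
    return sorted(candidates, key=lambda box: iou(box, truth), reverse=True)

# Hypothetical answer variations for one image, plus its labeled box.
truth = (0, 0, 2, 2)
candidates = [(5, 5, 6, 6), (1, 1, 3, 3), (0, 0, 2, 2)]
ranked = rank_candidates(candidates, truth)
```

A score like this is also the metric being optimized, which is why this method suits tasks with a clear automatic check.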
Not sure? Send 10 examples and we'll tell you which method to use.
One fine-tune that runs everywhere.
Closed APIs lock your fine-tune to their endpoint. Open frameworks make you build the training stack yourself. Lens trains the model for you, then lets you serve it from our cloud or run it on your own hardware with Photon.
Bring your hardest task. We'll prove it works.
Send us 10 labeled examples of your task. We will return a fine-tuned Moondream that does it better than the base model. If it does not, you owe us nothing.
Example fine-tunes based on real customer use cases.
Player with Ball Detection
Detect the player holding the basketball in NBA broadcast footage.
State Farm Logo Detection
Detect State Farm logos in NBA broadcast frames.
GeoGuessr Countries
Predict the country from a single street-view image.
Rock Paper Scissors
Classify hand gestures with 5 examples per class.
Glaucoma Detection
Classify retinal images by glaucoma stage.
Computer Use
Click the correct UI element from a screenshot and instruction.