Rock Paper Scissors
Classify hand gestures as rock, paper, or scissors. With only 5 training examples per class (15 images in total) and 50 RL steps, accuracy jumps from 54.8% to 98.8%. This shows that a simple task can reach near-perfect accuracy from just a few examples.
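Accuracy
| Metric | Value |
| --- | --- |
| Base model accuracy | 54.8% |
| Fine-tuned model accuracy | 98.8% |
| Method | RL |
| Steps | 50 |
| Training time | 23 min |
| Cost | $7.84 |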
See it in action
Compare the base model against the fine-tuned model across representative benchmark examples.
Prompt
Is this rock, paper, or scissors? Respond with rock, paper, or scissors only.
Scissors
Base model: paper (incorrect)
Fine-tuned model: scissors (correct)

Paper
Base model: rock (incorrect)
Fine-tuned model: paper (correct)

Rock
Base model: paper (incorrect)
Fine-tuned model: rock (correct)
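To make the comparison concrete, the three examples above can be scored with a few lines of Python. This is only an illustrative sketch: the `EXAMPLES` list and the `accuracy` helper are names introduced here, not part of the Moondream API, and the answers are copied by hand from the examples rather than produced by a model call.

```python
from typing import List, Sequence, Tuple

# (ground truth, base model answer, fine-tuned model answer),
# copied by hand from the three examples shown above.
EXAMPLES: List[Tuple[str, str, str]] = [
    ("scissors", "paper", "scissors"),
    ("paper", "rock", "paper"),
    ("rock", "paper", "rock"),
]


def accuracy(labels: Sequence[str], predictions: Sequence[str]) -> float:
    """Fraction of predictions that exactly match their label."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(label == pred for label, pred in zip(labels, predictions))
    return correct / len(labels)


labels = [truth for truth, _, _ in EXAMPLES]
base_accuracy = accuracy(labels, [base for _, base, _ in EXAMPLES])
tuned_accuracy = accuracy(labels, [tuned for _, _, tuned in EXAMPLES])

print(f"base: {base_accuracy:.1%}, fine-tuned: {tuned_accuracy:.1%}")
```

These three hand-picked cases are all ones the base model misses, so they score 0% against 100%. The 54.8% and 98.8% figures come from the full benchmark, which is not reproduced here.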
Perfection in 3 steps
What is fine-tuning?
Moondream starts as a general model trained on broad, public information. Fine-tuning makes it great at one specific task by teaching it the products, documents, categories, or internal information that matter to your business.
Who is this for?
This is for teams putting vision AI into production. If you already know the task and need the model to master that job, fine-tuning is how you get there. It is built for teams that need frontier performance at real-time speed.
See the code
Fine-tuning is just a small API loop: format your data, call `train_step`, and the model updates as you go.
import moondream as md

# Create fine-tune
ft = md.ft(
    api_key="your-api-key",
    name="Rock Paper Scissors",
    rank=8,
)

# Hidden boilerplate and data code

# Pair each training example with the query request sent to the model
requests = (
    (
        example,
        {
            "skill": "query",
            "image": example["image"],
            "question": "Is this rock, paper, or scissors? Respond with rock, paper, or scissors only.",
            "num_rollouts": 4,
        },
    )
    for example in training_data
)

# Score the rollouts for each request, then apply one RL training step
for context, response in ft.rollout_stream(requests):
    rewards = compute_rewards(context, response)
    ft.train_step([{
        "mode": "rl",
        "request": response["request"],
        "rollouts": response["rollouts"],
        "rewards": rewards,
    }])
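The loop above calls `compute_rewards`, which the snippet leaves hidden. One possible shape for such a function is sketched below. It assumes, without confirmation from the Moondream API, that each training example stores its ground truth under a `label` key and that each rollout is a dict holding the generated text under an `answer` key. Both key names, and the `score_answer` helper, are hypothetical and would need adapting to the real payload.

```python
from typing import Any, Dict, List

VALID_LABELS = {"rock", "paper", "scissors"}


def score_answer(answer: str, label: str) -> float:
    """Return 1.0 if the answer is exactly the expected label, else 0.0."""
    # Tolerate surrounding whitespace, capitalization, and a trailing period.
    normalized = answer.strip().lower().rstrip(".")
    if normalized not in VALID_LABELS:
        # Anything other than a bare label (e.g. a full sentence) earns nothing.
        return 0.0
    return 1.0 if normalized == label.strip().lower() else 0.0


def compute_rewards(context: Dict[str, Any], response: Dict[str, Any]) -> List[float]:
    """Return one reward per rollout, in rollout order."""
    # ASSUMPTION: context["label"] is the ground truth for this example.
    label = context["label"]
    # ASSUMPTION: each rollout is a dict with the model's text under "answer".
    return [score_answer(rollout["answer"], label) for rollout in response["rollouts"]]
```

A binary exact-match reward fits this task because the prompt asks for one of three words and nothing else. Answers that wander off format get no credit, which nudges the model toward the requested output style as well as the right class.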
Ready to take Moondream to production?
Need help? We'll build it for you.
We can help define the task, prepare the data, run training, validate results, and hand off a model your team can use.