Pricing
Simple, transparent pricing for every stage of growth.
- 1 seat
- $5/mo free usage included
- 2 deployed finetunes
- Moondream Station
- Discord support
- 10 seats
- $50/mo usage included
- 25 deployed finetunes
- Moondream Station
- Email support with a 3-business-day SLA
Scale
For larger organizations scaling inference and RL.
per month, billed annually
- 50 seats
- $100/mo usage included
- 200 deployed finetunes
- Moondream Station Pro
- Dedicated Slack channel with a 1-business-day SLA
- HIPAA available
Enterprise
For advanced compliance and deployment needs.
- Unlimited seats
- Custom usage
- Unlimited deployed finetunes
- Moondream Station Pro
- Custom-tailored support
- HIPAA, audit logs, and SSO
Moondream Station is our free, open-source local inference package with PyTorch and MLX support. Moondream Station Pro unlocks access to our high-performance inference engine (the same one that powers our cloud API), featuring custom CUDA kernels, continuous batching, and up to 5x the throughput of standard PyTorch. Scale plans include one device license; additional devices are available for a per-device fee.
Usage-based pricing
Pay only for what you use. All plans include monthly usage credits.
Moondream 3 (Preview)
| Mode | Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|---|
| Real-time | Base model | $0.3000 | $2.5000 |
| Real-time | Finetune (+25%) | $0.3750 | $3.1250 |
| Batch (-50%) | Base model | $0.1500 | $1.2500 |
| Batch (-50%) | Finetune (+25%) | $0.1875 | $1.5625 |
| Finetuning (free during preview) | Rollout generation | $0.3000 (Free) | $2.5000 (Free) |
| Finetuning (free during preview) | Training | $2.5000 (Free) | — |
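The inference rates in the table appear to follow from the base real-time rates and the two labeled adjustments. The sketch below reproduces them under the assumption that the finetune surcharge (+25%) and the batch discount (-50%) multiply together; the function and constant names are illustrative only and are not part of any Moondream API.

```python
from decimal import Decimal

# Base real-time rates in USD per 1M tokens, taken from the table.
BASE_INPUT = Decimal("0.30")
BASE_OUTPUT = Decimal("2.50")

# Adjustments read off the table labels (assumed to compound multiplicatively).
FINETUNE_MULTIPLIER = Decimal("1.25")  # "+25%"
BATCH_MULTIPLIER = Decimal("0.50")     # "-50%"


def rate_per_million(base, finetune=False, batch=False):
    """Return the USD rate per 1M tokens for the given mode and model type."""
    rate = base
    if finetune:
        rate *= FINETUNE_MULTIPLIER
    if batch:
        rate *= BATCH_MULTIPLIER
    return rate


if __name__ == "__main__":
    # Print all four inference combinations for comparison with the table.
    for batch in (False, True):
        for finetune in (False, True):
            mode = "Batch" if batch else "Real-time"
            model = "Finetune" if finetune else "Base model"
            inp = rate_per_million(BASE_INPUT, finetune, batch)
            out = rate_per_million(BASE_OUTPUT, finetune, batch)
            print(f"{mode:9} {model:10} input ${inp:.4f}  output ${out:.4f}")
```

Decimal arithmetic is used here so that values such as $0.1875 are represented exactly instead of being subject to floating-point rounding.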
Each image consumes 729 tokens. Text is tokenized using a superword tokenizer. Try the playground to see token usage for your requests.
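As a rough illustration of how the per-image token count feeds into cost, the sketch below estimates the price of a single request. It assumes image tokens are billed at the input rate and defaults to the real-time base-model rates listed above. Text token counts depend on the tokenizer, so they are passed in as plain numbers here; the function name and the example values are hypothetical.

```python
from decimal import Decimal

TOKENS_PER_IMAGE = 729                    # stated on this page
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_request_cost(
    num_images,
    text_input_tokens,
    output_tokens,
    input_rate=Decimal("0.30"),    # USD per 1M input tokens (real-time, base model)
    output_rate=Decimal("2.50"),   # USD per 1M output tokens (real-time, base model)
):
    """Estimate the USD cost of one request.

    Assumes image tokens are billed at the input rate; text token counts
    must come from actual usage data (for example, the playground).
    """
    input_tokens = num_images * TOKENS_PER_IMAGE + text_input_tokens
    input_cost = Decimal(input_tokens) * input_rate / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * output_rate / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: one image, 21 prompt tokens, 100 output tokens.
    cost = estimate_request_cost(num_images=1, text_input_tokens=21, output_tokens=100)
    print(f"Estimated cost: ${cost:.6f}")
```

For batch or finetuned-model requests, pass the corresponding rates from the table as `input_rate` and `output_rate`.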