Powerful visual AI.
Tiny footprint.

Moondream is an open-source visual language model that understands images using simple text prompts. It's fast and wildly capable.

Capabilities

One Model, Many Capabilities

Moondream supports a growing set of visual capabilities, all accessible through natural-language prompts.

Image Captioning

Caption:
The image shows a man in a blue jumpsuit and yellow hard hat standing in a large industrial setting. He is wearing safety glasses and ear protection, and is holding a clipboard and a pen, appearing to be taking notes...
Manufacturing
Compliance
Synthetic Data

Visual Question Answering

Query: Is any vehicle unsecured? Describe.
Yes, there is an unsecured truck parked in the area. The truck is filled with boxes, and it appears to be a delivery truck. The presence of the unsecured truck and the boxes suggests that it might be a delivery service or a delivery truck parked in a public area.
Transportation
Security
Agentic AI

Object Detection

Detect: License plate
(x=0.431, y=0.713, x2=0.569, y2=0.921)
Retail
Inventory
Transportation
Robotics
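
The detection result above looks like a bounding box in normalized coordinates, where each value is a fraction of the image width or height. A minimal sketch of converting such a box to pixel corners, assuming that interpretation holds: the helper name `bbox_to_pixels` and the 1920x1080 image size are illustrative only and are not part of any Moondream client.

```python
def bbox_to_pixels(x, y, x2, y2, width, height):
    """Scale a normalized (0..1) box to integer pixel corners.

    Assumes (x, y) is the top-left corner and (x2, y2) the bottom-right,
    both given as fractions of the image width and height.
    """
    for value in (x, y, x2, y2):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"coordinate {value} is outside [0, 1]")
    left, top = round(x * width), round(y * height)
    right, bottom = round(x2 * width), round(y2 * height)
    return left, top, right, bottom


# Values copied from the license-plate example; the image size is assumed.
box = bbox_to_pixels(0.431, 0.713, 0.569, 0.921, width=1920, height=1080)
print(box)
```

Keeping coordinates normalized until the last step means the same result can be reused on a resized copy of the image.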

Pointing (x, y)

Point: Defect in train tracks.
(x=0.431, y=0.505)
Quality Control
Compliance
Transportation
Defense
Surveillance
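
A pointing result such as the one above can be read as a single normalized location. The sketch below parses the display string exactly as printed on this page and maps it to a pixel index. The string format, the helper names `parse_point` and `point_to_pixel`, and the 1280x720 image size are all assumptions for illustration; a real client may return structured data rather than text.

```python
import re


def parse_point(text):
    """Extract named float fields such as x=0.431 from a display string."""
    pairs = re.findall(r"(\w+)=([0-9]*\.?[0-9]+)", text)
    return {name: float(value) for name, value in pairs}


def point_to_pixel(point, width, height):
    """Map a normalized point to integer pixel indices inside the image."""
    # Clamp to the last valid index so a value of exactly 1.0 stays in bounds.
    px = min(int(point["x"] * width), width - 1)
    py = min(int(point["y"] * height), height - 1)
    return px, py


point = parse_point("(x=0.431, y=0.505)")
print(point_to_pixel(point, width=1280, height=720))
```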

Gaze Detection

Detect Gaze:
The operator is looking at the bottom-right section of the control panel, near the red warning light.
Manufacturing
Safety
Transportation
Retail
Real-world Agentic AI

OCR & Document Understanding

Query: Transcribe the text in natural reading order.
"Preface, The computing world has undergone a revolution since the publication of The C Programming Language in 1978. Big computers are much bigger, and personal computers have capabilities..."
Logistics
Office Automation
Legal

Moondream is trusted by

CalPoly
Get Started

Get Running in Minutes.

Moondream is open source and you can install and run it anywhere, for free. You can have it running on your computer or in our cloud in a matter of minutes.

Run It Yourself
  • Moondream Station is free
  • Works with our Python and Node clients
  • Works offline, fully under your control
  • CPU or GPU compatible
Moondream Station
Run in the Cloud
  • No downloads required
  • Free tier: 5,000 requests per day
  • Works with the same Python and Node clients
  • Scales to production
Moondream Cloud
Community

Trusted by Developers Everywhere.

Used in real-world applications across retail, logistics, healthcare, defense, and more.

Ben Caunt
Software Engineer & Founder
Moondream allowed me to implement semantic behaviors for robotics systems far easier than any other way I could think of doing it.
Aya & Dan Bochman
Founders of FashnAI
We were looking for the fastest VLM model that could still reliably handle our use-case, and moondream nailed it.
Anurag Phadke
Software Engineer
Blazingly fast, clean interface and APIs, extensible, open-source, actually works, friendly website w/ fine-tuning instructions.