Moondream Station is our free on-prem client for Moondream. A one-click (or one-command) installer makes it a snap to get Moondream running on your Mac, PC, or Linux box. Today we're happy to announce Moondream 3 Preview support on Mac. Try it out for yourself (works on Mac, Windows, and Linux):
pip install moondream-station
Built for Apple Silicon
To get the most out of Apple Silicon, we built Mac inference to be fully MLX native and added quantized Moondream 3 support. The result is snappy performance. You'll need a Mac with at least 16GB of memory. On an M1 Max with 64GB, we are seeing over 35 tokens per second. Here's a demo of how it works on that M1 Max:
Moondream uses dedicated grounding tokens, so each x or y coordinate takes only one token. This means inference for grounded skills like point or detect feels nearly instantaneous.
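To put rough numbers on that, here is a back-of-the-envelope sketch of decode time at the 35 tokens per second quoted above. It assumes a point is one x and one y token, and that a box is described by four coordinate values. The box layout is our assumption for illustration, and the estimate ignores image encoding and prompt processing time. The `decode_seconds` helper is hypothetical and not part of any Moondream package.

```python
# Throughput quoted above for an M1 Max with 64GB (a lower bound: "over 35").
TOKENS_PER_SECOND = 35.0


def decode_seconds(num_coordinates: int,
                   tokens_per_second: float = TOKENS_PER_SECOND) -> float:
    """Estimate decode time when each coordinate costs exactly one token."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_coordinates / tokens_per_second


point_time = decode_seconds(2)  # one x token plus one y token
box_time = decode_seconds(4)    # assumption: four coordinate values per box

print(f"point: ~{point_time * 1000:.0f} ms, box: ~{box_time * 1000:.0f} ms")
```

Under these assumptions, the coordinates for a single point or box decode in well under a quarter of a second.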
Using the API
Once Moondream Station is running, you can connect to it using our Python client:
# pip install moondream
import moondream as md
from PIL import Image
# Connect to Moondream Station
model = md.vl(endpoint="http://localhost:2020/v1")
# Load an image
image = Image.open("path/to/image.jpg")
# Ask a question
answer = model.query(image, "What's in this image?")["answer"]
print("Answer:", answer)
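The grounded skills mentioned earlier can be called through the same client. The sketch below assumes that `model.detect` returns a dictionary with an `"objects"` list whose entries carry `x_min`, `y_min`, `x_max`, and `y_max` values normalized to the 0 to 1 range. Check the client documentation for the exact response shape in your version. The helper names `to_pixel_box` and `detect_pixel_boxes` are ours, for illustration only.

```python
def to_pixel_box(obj, width, height):
    """Convert one normalized box (values in 0..1) to integer pixel coordinates."""
    return (
        round(obj["x_min"] * width),
        round(obj["y_min"] * height),
        round(obj["x_max"] * width),
        round(obj["y_max"] * height),
    )


def detect_pixel_boxes(image_path, label, endpoint="http://localhost:2020/v1"):
    """Run detect against a local Moondream Station and return pixel boxes."""
    # Imported here so to_pixel_box can be used without these packages installed.
    import moondream as md
    from PIL import Image

    model = md.vl(endpoint=endpoint)
    image = Image.open(image_path)
    # Assumed response shape: {"objects": [{"x_min": ..., "y_min": ..., ...}]}
    result = model.detect(image, label)
    return [to_pixel_box(obj, image.width, image.height) for obj in result["objects"]]
```

With Moondream Station running, a call such as `detect_pixel_boxes("path/to/image.jpg", "person")` would return one `(left, top, right, bottom)` tuple per detected object, ready for drawing or cropping.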
What's next
We are planning to make more improvements to Moondream Station over the next few weeks. If you have ideas or requests, reach out on Discord.
Happy holidays from the Moondream team.



