Moondream

Engineering

Blog posts in the Engineering category

Popping the GPU Bubble
Engineering
June 4, 2026
Photon, Moondream's inference engine, achieves near-real-time VLM inference (~33 ms on an NVIDIA B200). This post is a peek into how it delivers up to 35% higher decode throughput by optimizing how the GPU works.

A production vision language model. Built by M87 Labs.

© 2026 M87 Labs, Inc. Proudly built in the USA