Moondream works better than Gemini in basic robot use cases because it's faster and more accurate.
Who I Am
I'm Yusuf Yıldırım, a 13-year-old developer from Turkey. I've been into electronics and software development since I was 5 years old.
What I Do
Robotics, electronics, and software development are my main hobbies. I have joined multiple competitions with my projects and won many awards. I have made (3D-designed, printed, and programmed) a lot of robots.
Why Moondream
I love Moondream for several reasons. First, the API is fast and has more generous limits than any other model I have found. This makes it practical to use Moondream from microcontrollers like the ESP32-CAM. I have already built a project around this: I worked out the API endpoint from the curl commands on the website and got the Moondream API working with the ESP32-CAM. It has some bugs and issues, but it worked! I will share it on my GitHub soon. Second, it can run on small devices, even a Raspberry Pi. Lastly, it's open source. I don't think I even have to explain why that's so good.
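To show what "working out the endpoint from the curl commands" roughly looks like, here is a minimal Python sketch of the same kind of request. It is not my ESP32-CAM code. The URL, the auth header name, and the JSON field names are assumptions written from memory of the public curl examples, so check them against the current Moondream docs before relying on them. The function names are made up for this example.

```python
import base64
import json
import urllib.request

# ASSUMPTION: endpoint and header name as recalled from the public curl
# examples; verify both against the current Moondream API documentation.
API_URL = "https://api.moondream.ai/v1/query"
AUTH_HEADER = "X-Moondream-Auth"


def build_query_request(jpeg_bytes: bytes, question: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST with one image and one question."""
    # The image travels inside the JSON body as a base64 data URL.
    image_b64 = base64.b64encode(jpeg_bytes).decode("ascii")
    payload = {
        # ASSUMPTION: these field names mirror the curl example.
        "image_url": f"data:image/jpeg;base64,{image_b64}",
        "question": question,
        "stream": False,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", AUTH_HEADER: api_key},
        method="POST",
    )


def ask(jpeg_bytes: bytes, question: str, api_key: str, timeout: float = 30.0) -> dict:
    """Send the request and return the parsed JSON response as a dict."""
    request = build_query_request(jpeg_bytes, question, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

On a microcontroller the steps are the same: capture a JPEG frame, base64-encode it, wrap it in the JSON body, and POST it over HTTPS. The hard part there is usually memory, because the base64 text is about a third larger than the raw JPEG.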
My Use Case
I see a lot of possibilities with Moondream, so I don't have just one specific use case, but I will mostly use it in robotics. I have made (3D-designed, printed, and programmed) a lot of robots, but their vision capabilities weren't very advanced. I needed to add image Q&A so the robots could answer questions about what they "see." I did add it, and to my surprise, Moondream works better than Gemini in basic robot use cases (from my experience) because it's faster and more accurate. The Gemini Vision API gave accurate answers too, but in my tests it took longer per answer and its answers were not as good as Moondream's.
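A speed comparison like this is easy to repeat. The sketch below is one way to time "seconds per answer" for any vision Q&A backend. It is not the exact test I ran. `time_answers` and the `ask` callable are hypothetical names: `ask` stands for whatever function sends one frame and one question to a model and returns its answer.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_answers(
    ask: Callable[[bytes, str], str],
    frames: List[bytes],
    question: str,
) -> Dict[str, float]:
    """Call ask(frame, question) once per frame and summarize the latency."""
    if not frames:
        raise ValueError("need at least one frame to time")
    latencies = []
    for frame in frames:
        start = time.perf_counter()  # monotonic clock suited to short timings
        ask(frame, question)
        latencies.append(time.perf_counter() - start)
    return {
        "mean_s": statistics.mean(latencies),
        "median_s": statistics.median(latencies),
        "max_s": max(latencies),
    }
```

Running it with the same frames and the same question for each backend keeps the comparison fair. Accuracy still has to be judged by reading the answers, since this only measures time.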
My Story
I am an 8th-grade student from Turkey. Over the years I have entered multiple competitions with my projects and won many awards. Most of those projects were robots or machines that can "see" using custom YOLO or OpenCV models that I built and trained myself, at least until VLMs started to improve. For example, banana production is very popular in the area where I currently live (Anamur, Mersin), so I made a YOLO model that tells good bananas from bad ones for producers. I gathered and labeled thousands of images of good and bad bananas and trained the model in Google Colab, all when I was just 9 years old (in 2020). This is just one of the thousands of projects I've worked on. I mention it because AI vision projects were that hard before models like Moondream existed. Now I have more than seven companies, including yAI (https://x-y-ai.web.app), VirtualGSMS (https://vgsms.github.io), and Factify (https://fact-ify.web.app).