Point API

The /point endpoint returns precise coordinate locations for specific objects in an image. Unlike /detect, which returns bounding boxes, this endpoint returns a center point for each instance of the specified object.

Endpoint

POST https://api.moondream.ai/v1/point

Request Format

Parameter   Type     Required   Description
image_url   string   Yes        Base64 encoded image with data URI prefix (e.g., "data:image/jpeg;base64,...")
object      string   Yes        The type of object to locate (e.g., "person", "car", "face")

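
The two parameters above can be assembled into a request body as in the following minimal sketch. The helper name build_point_payload and the mime_type argument are hypothetical and not part of the Moondream SDK. The sketch only builds the JSON body; sending it also requires an authentication header, which this section does not describe.

```python
import base64


def build_point_payload(image_bytes: bytes, obj: str,
                        mime_type: str = "image/jpeg") -> dict:
    """Build a JSON-serializable body for POST /v1/point.

    Hypothetical helper: the name and the mime_type default are
    illustrative assumptions, not part of any official client.
    """
    # Base64-encode the raw image bytes and prepend the data URI prefix.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "image_url": f"data:{mime_type};base64,{encoded}",
        "object": obj,
    }


# Placeholder bytes stand in for the contents of a real image file.
payload = build_point_payload(b"\xff\xd8\xff\xe0 placeholder bytes", "face")
print(payload["image_url"][:40])
```

In practice, image_bytes would come from reading the image file in binary mode, and mime_type should match the actual image format.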
Streaming Support

This endpoint does not support streaming responses.

Response Format

{
  "request_id": "2025-03-25_point_2025-03-25-21:00:39-715d03",
  "points": [
    {
      "x": 0.65,      // x coordinate (normalized 0-1)
      "y": 0.42       // y coordinate (normalized 0-1)
    },
    // Additional points...
  ]
}

Coordinate System

Coordinates are normalized to the image dimensions, ranging from 0 to 1:

  • (0,0) is the top-left corner of the image
  • (1,1) is the bottom-right corner of the image

To convert to pixel coordinates, multiply by the image dimensions:

  • pixel_x = x * image_width
  • pixel_y = y * image_height
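
The conversion above can be sketched as a small helper. The function name to_pixels is hypothetical. Rounding to integers and clamping to the last valid pixel index are assumptions made here so that a coordinate of exactly 1.0 still maps inside the image; they are not behavior documented by the API.

```python
def to_pixels(points, image_width, image_height):
    """Convert normalized points to integer pixel coordinates.

    Hypothetical helper. Clamping to width - 1 / height - 1 is an
    assumption so that x == 1.0 or y == 1.0 stays a valid pixel index.
    """
    pixels = []
    for p in points:
        px = min(round(p["x"] * image_width), image_width - 1)
        py = min(round(p["y"] * image_height), image_height - 1)
        pixels.append((px, py))
    return pixels


# Sample response shaped like the one in "Response Format".
response = {
    "request_id": "2025-03-25_point_2025-03-25-21:00:39-715d03",
    "points": [{"x": 0.65, "y": 0.42}],
}
pixels = to_pixels(response["points"], image_width=1280, image_height=720)
print(pixels)  # [(832, 302)]
```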

Examples

import moondream as md
from PIL import Image
import matplotlib.pyplot as plt
 
# Initialize with API key
model = md.vl(api_key="your-api-key")
 
# Load an image
image = Image.open("path/to/image.jpg")
 
# Locate objects (e.g., "person", "car", "face", etc.)
result = model.point(image, "face")
points = result["points"]
request_id = result["request_id"]
print(f"Found {len(points)} faces")
print(f"Request ID: {request_id}")
 
# Visualize the points
plt.figure(figsize=(10, 10))
plt.imshow(image)
 
for point in points:
    # Convert normalized coordinates to pixel values
    x = point["x"] * image.width
    y = point["y"] * image.height
    
    # Plot the point
    plt.plot(x, y, 'ro', markersize=15, alpha=0.7)
    plt.text(
        x + 10, y, "Face", 
        color='white', fontsize=12,
        bbox=dict(facecolor='red', alpha=0.5)
    )
 
plt.axis('off')
plt.savefig("output_with_points.jpg")
plt.show()

Use Cases

  • Interactive image experiences
  • Accurate object counting
  • Heat map generation
  • Creating image annotations
  • Precise focus point identification for photography applications

Difference Between /point and /detect

Point vs. Detect
  • /point returns center coordinates only (x, y)
  • /detect returns bounding boxes (x_min, y_min, x_max, y_max)

Use /point when you need:

  • Precise object center points
  • Simpler data for plotting
  • Multiple instance counting

Use /detect when you need:

  • Object size information
  • Visual highlighting of complete objects
  • Cropping regions of interest
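
To make the relationship between the two result shapes concrete, the sketch below computes the geometric center of a bounding box using the field names listed above. The helper box_center is hypothetical, and it assumes box coordinates are normalized like point coordinates. A geometric box center is only an approximation and will not necessarily match the point that /point returns for the same object.

```python
def box_center(box):
    """Return the geometric center of a bounding box (hypothetical helper).

    Assumes the box uses the x_min, y_min, x_max, y_max fields and the
    same normalized coordinate system as points.
    """
    return {
        "x": (box["x_min"] + box["x_max"]) / 2,
        "y": (box["y_min"] + box["y_max"]) / 2,
    }


# Illustrative box covering the lower-middle of the image.
box = {"x_min": 0.25, "y_min": 0.5, "x_max": 0.75, "y_max": 1.0}
center = box_center(box)
print(center)  # {'x': 0.5, 'y': 0.75}
```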

Common Object Types

Moondream can locate a wide range of objects. Here are some commonly used examples:

  • person
  • face
  • car
  • dog
  • cat
  • building
  • furniture
  • text
  • food
  • plant

Zero-Shot Detection

Like object detection, Moondream's pointing is zero-shot: it can locate virtually any object you specify, not only objects from a predefined list. For best results, describe the object as specifically as possible.

Error Handling

Common error responses:

Status Code   Description
400           Bad Request - Invalid parameters or image format
401           Unauthorized - Invalid or missing API key
413           Payload Too Large - Image size exceeds limits
429           Too Many Requests - Rate limit exceeded
500           Internal Server Error - Server-side issue

Error Response Format

Error responses are returned in the following format:

{
  "error": {
    "message": "Detailed error description",
    "type": "error_type",
    "param": "parameter_name",
    "code": "error_code"
  }
}
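
One way to handle this format in client code is sketched below. The PointAPIError class and both helpers are hypothetical names. Treating 429 and 500 as retryable is a common convention assumed here, not something this documentation specifies.

```python
# Assumption: rate-limit and server errors are worth retrying;
# 400, 401, and 413 indicate a problem with the request itself.
RETRYABLE_STATUS = {429, 500}


class PointAPIError(Exception):
    """Hypothetical exception carrying the fields of an error response."""

    def __init__(self, status, message, type_=None, param=None, code=None):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message
        self.type = type_
        self.param = param
        self.code = code


def parse_error(status, body):
    """Build a PointAPIError from a status code and a decoded JSON body."""
    err = body.get("error", {}) if isinstance(body, dict) else {}
    return PointAPIError(
        status,
        err.get("message", "Unknown error"),
        type_=err.get("type"),
        param=err.get("param"),
        code=err.get("code"),
    )


def is_retryable(status):
    """Return True if a retry with backoff is reasonable (assumed policy)."""
    return status in RETRYABLE_STATUS


# Body shaped like the error format shown above.
sample = {"error": {"message": "Detailed error description",
                    "type": "error_type",
                    "param": "parameter_name",
                    "code": "error_code"}}
error = parse_error(400, sample)
print(error, is_retryable(error.status))
```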

Limitations

  • Maximum image size: 10 MB
  • Supported image formats: JPEG, PNG, GIF (first frame only)
  • Detection works best on clearly visible objects
  • Multiple small objects may be more challenging to locate
  • Rate limits apply based on your plan
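
The size and format limits can be checked locally before sending a request, which avoids a round trip that would end in a 400 or 413 response. The sketch below is a hypothetical pre-flight check. It assumes that 10 MB means 10 * 1024 * 1024 bytes and that the limit applies to the raw image bytes, neither of which is stated above. Formats are identified by their standard file signatures.

```python
# Assumption: "10MB" is interpreted as 10 MiB of raw image data.
MAX_IMAGE_BYTES = 10 * 1024 * 1024

# Standard file signatures (magic numbers) for the supported formats.
SIGNATURES = {
    "image/jpeg": (b"\xff\xd8\xff",),
    "image/png": (b"\x89PNG\r\n\x1a\n",),
    "image/gif": (b"GIF87a", b"GIF89a"),
}


def sniff_mime(data):
    """Return the MIME type matching the leading bytes, or None."""
    for mime, signatures in SIGNATURES.items():
        if data.startswith(signatures):
            return mime
    return None


def check_image(data):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if len(data) > MAX_IMAGE_BYTES:
        problems.append(f"image is {len(data)} bytes, over {MAX_IMAGE_BYTES}")
    if sniff_mime(data) is None:
        problems.append("image is not JPEG, PNG, or GIF")
    return problems


# A GIF header followed by filler bytes passes both checks.
print(check_image(b"GIF89a" + b"\x00" * 10))  # []
```

The MIME type returned by sniff_mime can also be used to fill in the data URI prefix of image_url.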