Point API

The /point endpoint returns precise coordinates for specific objects in an image. Unlike /detect, which returns bounding boxes, it returns one center point for each instance of the specified object.

Endpoint

POST https://api.moondream.ai/v1/point

Request Format

Parameter	Type	Required	Description
`image_url`	string	Yes	Base64 encoded image with data URI prefix (e.g., `"data:image/jpeg;base64,..."`)
`object`	string	Yes	The type of object to locate (e.g., `"person"`, `"car"`, `"face"`)
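
For callers not using the Python client, the request body can be assembled by hand. The sketch below is a minimal illustration using only the standard library. The helper names `build_point_payload` and `build_point_request` are hypothetical, and the `X-Moondream-Auth` header name is an assumption that the table above does not confirm, so check it against your account documentation.

```python
import base64
import json
import urllib.request

POINT_URL = "https://api.moondream.ai/v1/point"


def build_point_payload(image_bytes: bytes, object_name: str,
                        mime: str = "image/jpeg") -> dict:
    """Build the JSON body for /point from raw image bytes."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        # Data URI prefix followed by the Base64 payload, as the table describes
        "image_url": f"data:{mime};base64,{encoded}",
        "object": object_name,
    }


def build_point_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Wrap the payload in a POST request (nothing is sent until urlopen is called)."""
    return urllib.request.Request(
        POINT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the API key is passed in this header
            "X-Moondream-Auth": api_key,
        },
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` and decoding the body with `json.loads` should yield the structure shown under Response Format.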

Streaming Support

This endpoint does not support streaming responses.

Response Format

{
  "request_id": "2025-03-25_point_2025-03-25-21:00:39-715d03",
  "points": [
    {
      "x": 0.65,      // x coordinate (normalized 0-1)
      "y": 0.42       // y coordinate (normalized 0-1)
    },
    // Additional points...
  ]
}

Coordinate System

Coordinates are normalized to the image dimensions, ranging from 0 to 1:

(0,0) is the top-left corner of the image
(1,1) is the bottom-right corner of the image

To convert to pixel coordinates, multiply by the image dimensions:

pixel_x = x * image_width
pixel_y = y * image_height
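
The conversion above can be wrapped in a small helper. This is a sketch rather than part of the API: `to_pixels` is a hypothetical name, and rounding to the nearest pixel and clamping to the image bounds are choices made here so that a coordinate of exactly 1.0 still maps to a valid pixel index.

```python
def to_pixels(points, image_width, image_height):
    """Convert normalized points to integer pixel coordinates inside the image."""
    pixels = []
    for p in points:
        # Scale to pixel space, then round to the nearest whole pixel
        px = round(p["x"] * image_width)
        py = round(p["y"] * image_height)
        # Clamp so that x == 1.0 or y == 1.0 maps to the last valid index
        px = max(0, min(px, image_width - 1))
        py = max(0, min(py, image_height - 1))
        pixels.append((px, py))
    return pixels
```

Called with the `points` list from a response and the image's width and height, it returns a list of `(x, y)` tuples suitable for indexing or drawing.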

Examples

import moondream as md
from PIL import Image
import matplotlib.pyplot as plt
 
# Initialize with API key
model = md.vl(api_key="your-api-key")
 
# Load an image
image = Image.open("path/to/image.jpg")
 
# Locate objects by name (e.g., "person", "car", "face")
result = model.point(image, "face")
points = result["points"]
request_id = result["request_id"]
print(f"Found {len(points)} faces")
print(f"Request ID: {request_id}")
 
# Visualize the points
plt.figure(figsize=(10, 10))
plt.imshow(image)
 
for point in points:
    # Convert normalized coordinates to pixel values
    x = point["x"] * image.width
    y = point["y"] * image.height
    
    # Plot the point
    plt.plot(x, y, 'ro', markersize=15, alpha=0.7)
    plt.text(
        x + 10, y, "Face", 
        color='white', fontsize=12,
        bbox=dict(facecolor='red', alpha=0.5)
    )
 
plt.axis('off')
plt.savefig("output_with_points.jpg")
plt.show()

Use Cases

Interactive image experiences
Accurate object counting
Heat map generation
Creating image annotations
Precise focus point identification for photography applications
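
As an illustration of the counting and heat map use cases, the sketch below bins normalized points into a coarse grid. The function name `point_density_grid` and the grid size are hypothetical choices, and the code assumes every coordinate lies in the documented 0 to 1 range.

```python
def point_density_grid(points, rows=4, cols=4):
    """Count normalized points per cell of a rows x cols grid."""
    grid = [[0] * cols for _ in range(rows)]
    for p in points:
        # Map each coordinate to a cell index; 1.0 falls into the last cell
        col = min(int(p["x"] * cols), cols - 1)
        row = min(int(p["y"] * rows), rows - 1)
        grid[row][col] += 1
    return grid


def count_points(grid):
    """Total number of points across all cells (the object count)."""
    return sum(sum(row) for row in grid)
```

The resulting grid can be passed directly to `plt.imshow` to render a heat map over the image.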

Difference Between /point and /detect

/point returns center coordinates only (x, y)
/detect returns bounding boxes (x_min, y_min, x_max, y_max)

Use /point when you need:

Precise object center points
Simpler data for plotting
Multiple instance counting

Use /detect when you need:

Object size information
Visual highlighting of complete objects
Cropping regions of interest

Common Object Types

Moondream can locate a wide range of objects. Here are some commonly used examples:

person
face
car
dog
cat
building
furniture
text
food
plant

Zero-Shot Detection

Like object detection, pointing in Moondream is zero-shot: it can locate virtually any object you name, not only objects from a predefined list. For best results, describe the object as specifically as possible.

Error Handling

Common error responses:

Status Code	Description
400	Bad Request - Invalid parameters or image format
401	Unauthorized - Invalid or missing API key
413	Payload Too Large - Image size exceeds limits
429	Too Many Requests - Rate limit exceeded
500	Internal Server Error - Server-side issue

Error Response Format

Error responses are returned in the following format:

{
  "error": {
    "message": "Detailed error description",
    "type": "error_type",
    "param": "parameter_name",
    "code": "error_code"
  }
}
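
A client can unpack this envelope defensively, since a failing proxy or gateway may return a body that is not JSON at all. The sketch below assumes only the field names shown above. `describe_error` is a hypothetical helper, and treating 429 and 500 as retryable is a design choice made here, not documented API behavior.

```python
import json

# Assumption: rate-limit and server errors are worth retrying; other codes are not
RETRYABLE_STATUS = {429, 500}


def describe_error(status_code: int, body: str) -> dict:
    """Summarize an error response, tolerating missing or malformed bodies."""
    try:
        error = json.loads(body).get("error", {})
    except (json.JSONDecodeError, AttributeError):
        error = {}
    if not isinstance(error, dict):
        error = {}
    return {
        "status": status_code,
        "message": error.get("message", "Unknown error"),
        "type": error.get("type"),
        "param": error.get("param"),
        "code": error.get("code"),
        "retryable": status_code in RETRYABLE_STATUS,
    }
```

Callers can log `message` and `param` for 400-level failures and back off before retrying when `retryable` is true.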

Limitations

Maximum image size: 10MB
Supported image formats: JPEG, PNG, GIF (first frame only)
Detection works best on clearly visible objects
Multiple small objects may be more challenging to locate
Rate limits apply based on your plan
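
Some of these limits can be checked locally before a request is sent, which avoids a round trip that would end in a 400 or 413. The sketch below is hypothetical: `check_image` is not part of the client library, "10MB" is read as 10 * 1024 * 1024 bytes, and the limit is assumed to apply to the raw file rather than to its Base64-encoded form.

```python
import os

# Assumption: "10MB" means 10 MiB and applies to the raw file size
MAX_IMAGE_BYTES = 10 * 1024 * 1024
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"}


def check_image(filename: str, size_bytes: int) -> list:
    """Return a list of problems found; an empty list means the file looks acceptable."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or 'no extension'}")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_IMAGE_BYTES}")
    return problems
```

For a file on disk, `check_image(path, os.path.getsize(path))` performs both checks. The extension test is only a heuristic, since it does not inspect the file contents.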
