/point
The /point endpoint provides precise coordinate locations for specific objects in images. Unlike /detect, which returns bounding boxes, this endpoint returns the center point of each instance of the specified object.
Endpoint
POST https://api.moondream.ai/v1/point
Request Format
Parameter | Type | Required | Description |
---|---|---|---|
`image_url` | string | Yes | Base64-encoded image as a data URI, including the prefix (e.g., `"data:image/jpeg;base64,..."`) |
`object` | string | Yes | The type of object to locate (e.g., "person", "car", "face") |
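As a minimal sketch of how the request body might be assembled, the helper below Base64-encodes raw image bytes into a data URI and pairs it with the `object` field. The function name `build_point_payload` and its `mime_type` argument are illustrative and not part of the API. Sending the request, including any authentication header, is left out because this section does not describe it.

```python
import base64
import json


def build_point_payload(image_bytes: bytes, object_name: str,
                        mime_type: str = "image/jpeg") -> dict:
    """Build a JSON-serializable body for POST /v1/point (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        # Data URI prefix followed by the Base64 payload
        "image_url": f"data:{mime_type};base64,{encoded}",
        "object": object_name,
    }


# Stand-in bytes for illustration; in practice, read them from an image file
payload = build_point_payload(b"\xff\xd8\xff\xe0example", "face")
body = json.dumps(payload)
```

The resulting `body` string is what would be sent as the JSON request body.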
Streaming Support
This endpoint does not support streaming responses.
Response Format
    {
      "request_id": "2025-03-25_point_2025-03-25-21:00:39-715d03",
      "points": [
        {
          "x": 0.65, // x coordinate (normalized 0-1)
          "y": 0.42  // y coordinate (normalized 0-1)
        },
        // Additional points...
      ]
    }
Coordinate System
Coordinates are normalized to the image dimensions, ranging from 0 to 1:
- (0,0) is the top-left corner of the image
- (1,1) is the bottom-right corner of the image
To convert to pixel coordinates, multiply by the image dimensions:
- pixel_x = x * image_width
- pixel_y = y * image_height
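The conversion above can be sketched as a small helper. The name `to_pixel` is illustrative. Rounding to integers and clamping to the last valid pixel index are design choices made here, not behavior specified by the API. They keep a coordinate of exactly 1.0 from landing one pixel outside the image.

```python
from typing import Dict, Tuple


def to_pixel(point: Dict[str, float], image_width: int,
             image_height: int) -> Tuple[int, int]:
    """Convert one normalized point to integer pixel coordinates."""
    x = round(point["x"] * image_width)
    y = round(point["y"] * image_height)
    # Clamp so the result is always a valid pixel index (assumption, see above)
    x = min(max(x, 0), image_width - 1)
    y = min(max(y, 0), image_height - 1)
    return x, y


# Example points for a 1000 x 500 image
points = [{"x": 0.65, "y": 0.42}, {"x": 1.0, "y": 1.0}]
pixels = [to_pixel(p, 1000, 500) for p in points]
```

If sub-pixel positions are needed, for example when plotting, skip the rounding and use the raw products instead.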
Examples
    import moondream as md
    from PIL import Image
    import matplotlib.pyplot as plt

    # Initialize with API key
    model = md.vl(api_key="your-api-key")

    # Load an image
    image = Image.open("path/to/image.jpg")

    # Locate objects (e.g., "person", "car", "face", etc.)
    result = model.point(image, "face")
    points = result["points"]
    request_id = result["request_id"]
    print(f"Found {len(points)} faces")
    print(f"Request ID: {request_id}")

    # Visualize the points
    plt.figure(figsize=(10, 10))
    plt.imshow(image)

    for point in points:
        # Convert normalized coordinates to pixel values
        x = point["x"] * image.width
        y = point["y"] * image.height

        # Plot the point
        plt.plot(x, y, 'ro', markersize=15, alpha=0.7)
        plt.text(
            x + 10, y, "Face",
            color='white', fontsize=12,
            bbox=dict(facecolor='red', alpha=0.5)
        )

    plt.axis('off')
    plt.savefig("output_with_points.jpg")
    plt.show()
Use Cases
- Interactive image experiences
- Accurate object counting
- Heat map generation
- Creating image annotations
- Precise focus point identification for photography applications
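For the counting and heat map use cases, one possible approach is to bin the returned points into a coarse grid. This is a sketch only: the function name `point_density` and the grid size are illustrative, and the sample points are made up rather than taken from a real response.

```python
from typing import Dict, List


def point_density(points: List[Dict[str, float]], rows: int,
                  cols: int) -> List[List[int]]:
    """Count normalized points per grid cell (row-major, top-left origin)."""
    grid = [[0] * cols for _ in range(rows)]
    for p in points:
        # A coordinate of exactly 1.0 falls into the last cell
        col = min(int(p["x"] * cols), cols - 1)
        row = min(int(p["y"] * rows), rows - 1)
        grid[row][col] += 1
    return grid


# Made-up points standing in for the "points" list of a response
sample = [
    {"x": 0.1, "y": 0.1},
    {"x": 0.2, "y": 0.3},
    {"x": 0.9, "y": 0.8},
    {"x": 1.0, "y": 1.0},
]
grid = point_density(sample, rows=2, cols=2)
total = sum(sum(row) for row in grid)  # equals the number of located objects
```

The grid can then be passed to a plotting function such as `plt.imshow` to render the heat map.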
Difference Between /point and /detect
Point vs. Detect
- /point returns center coordinates only (x, y)
- /detect returns bounding boxes (x_min, y_min, x_max, y_max)
Use /point when you need:
- Precise object center points
- Simpler data for plotting
- Multiple instance counting
Use /detect when you need:
- Object size information
- Visual highlighting of complete objects
- Cropping regions of interest
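To make the relationship between the two shapes concrete, the sketch below computes the geometric center of a bounding box. It assumes a box is a dictionary with the keys x_min, y_min, x_max, and y_max; the name `box_center` is illustrative. A point returned by /point is not guaranteed to match this geometric center exactly.

```python
from typing import Dict, Tuple


def box_center(box: Dict[str, float]) -> Tuple[float, float]:
    """Return the geometric center (x, y) of a normalized bounding box."""
    x = (box["x_min"] + box["x_max"]) / 2
    y = (box["y_min"] + box["y_max"]) / 2
    return x, y


# Made-up box for illustration, in normalized coordinates
center = box_center({"x_min": 0.25, "y_min": 0.5, "x_max": 0.75, "y_max": 1.0})
```

Going the other way is not possible: a center point alone carries no size information, which is why /detect is the better fit for cropping or highlighting.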
Common Object Types
Moondream can locate a wide range of objects. Here are some commonly used examples:
- person
- face
- car
- dog
- cat
- building
- furniture
- text
- food
- plant
Zero-Shot Detection
Like object detection, pointing in Moondream is zero-shot: it can locate virtually any object you specify, not only objects from a predefined list. For best results, describe the object as specifically as possible.
Error Handling
Common error responses:
Status Code | Description |
---|---|
400 | Bad Request - Invalid parameters or image format |
401 | Unauthorized - Invalid or missing API key |
413 | Payload Too Large - Image size exceeds limits |
429 | Too Many Requests - Rate limit exceeded |
500 | Internal Server Error - Server-side issue |
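A client might treat some of these status codes as transient and retry them with increasing delays. The sketch below assumes that 429 and 500 are worth retrying and that 400, 401, and 413 are not. That split, the helper names, and the delay values are assumptions for illustration, not documented API behavior.

```python
# Assumption: only rate-limit and server-side errors are retried
RETRYABLE_STATUS_CODES = {429, 500}


def is_retryable(status_code: int) -> bool:
    """Return True if a request with this status code may succeed on retry."""
    return status_code in RETRYABLE_STATUS_CODES


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff: base * 2**attempt seconds, limited to cap."""
    return min(cap, base * (2 ** attempt))


# Delays in seconds for the first six attempts (attempt numbers start at 0)
delays = [backoff_delay(n) for n in range(6)]
```

In a request loop, sleep for `backoff_delay(attempt)` before retrying whenever `is_retryable` returns True, and give up after a fixed number of attempts.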
Error Response Format
Error responses are returned in the following format:
    {
      "error": {
        "message": "Detailed error description",
        "type": "error_type",
        "param": "parameter_name",
        "code": "error_code"
      }
    }
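An error body in this format could be turned into a readable message as sketched below. The helper name `describe_error` is illustrative, and the sample body uses made-up field values. The fallback for bodies that are missing or not valid JSON is a defensive assumption, not documented behavior.

```python
import json


def describe_error(status_code: int, body_text: str) -> str:
    """Summarize an error response as a single human-readable line."""
    try:
        error = json.loads(body_text).get("error", {})
    except (json.JSONDecodeError, AttributeError):
        error = {}  # Body was not JSON, or not a JSON object
    if not isinstance(error, dict):
        error = {}
    message = error.get("message") or "no error message provided"
    text = f"HTTP {status_code}: {message}"
    if error.get("param"):
        text += f" (param: {error['param']})"
    return text


# Made-up error body that follows the format shown above
sample_body = json.dumps({
    "error": {
        "message": "Invalid image format",
        "type": "example_type",
        "param": "image_url",
        "code": "example_code",
    }
})
summary = describe_error(400, sample_body)
```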
Limitations
- Maximum image size: 10MB
- Supported image formats: JPEG, PNG, GIF (first frame only)
- Detection works best on clearly visible objects
- Multiple small objects may be more challenging to locate
- Rate limits apply based on your plan
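Size and format limits like these can be checked on the client before uploading. The sketch below inspects the leading bytes of the file to recognize JPEG, PNG, and GIF. It reads "10MB" as 10 * 1024 * 1024 bytes, which is an assumption, since the exact byte limit is not stated. The helper names are illustrative.

```python
from typing import List, Optional

# Assumption: "10MB" interpreted as 10 MiB
MAX_IMAGE_BYTES = 10 * 1024 * 1024

# Leading byte signatures for the supported formats
_SIGNATURES = (
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
)


def sniff_format(data: bytes) -> Optional[str]:
    """Return "jpeg", "png", or "gif" based on the file signature, else None."""
    for signature, name in _SIGNATURES:
        if data.startswith(signature):
            return name
    return None


def check_image(data: bytes) -> List[str]:
    """Return a list of problems; an empty list means the image looks acceptable."""
    problems = []
    if len(data) > MAX_IMAGE_BYTES:
        problems.append("image exceeds the maximum size")
    if sniff_format(data) is None:
        problems.append("image is not JPEG, PNG, or GIF")
    return problems
```

A check like this only catches obvious problems early; the server remains the authority and may still reject a request with 400 or 413.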