Point API
The /point endpoint provides precise coordinate locations for specific objects in images. Unlike /detect, which returns bounding boxes, this endpoint returns center points for each instance of the specified object.
Endpoint
POST https://api.moondream.ai/v1/point
Request Format
Parameter | Type | Required | Description |
---|---|---|---|
image_url | string | Yes | Base64-encoded image with data URI prefix (e.g., "data:image/jpeg;base64,...") |
object | string | Yes | The type of object to locate (e.g., "person", "car", "face") |
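For callers who are not using the Python client, the request body can be assembled by hand. The following is a minimal sketch that builds the JSON payload from the two documented parameters and prepares (but does not send) a POST request. The helper names are hypothetical, and the `X-Moondream-Auth` header name is an assumption that this page does not state; confirm it against the authentication documentation before relying on it.

```python
import base64
import json
import urllib.request

POINT_URL = "https://api.moondream.ai/v1/point"


def build_point_payload(image_bytes: bytes, obj: str, mime: str = "image/jpeg") -> dict:
    """Build the JSON body from the two documented parameters."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "image_url": f"data:{mime};base64,{encoded}",  # data URI prefix is required
        "object": obj,
    }


def build_point_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Prepare a POST request without sending it."""
    return urllib.request.Request(
        POINT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Moondream-Auth": api_key,  # assumed header name, not confirmed here
        },
        method="POST",
    )
```

Passing the prepared request to `urllib.request.urlopen` would send it; keeping construction separate from sending makes the payload easy to inspect or log first.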
Streaming Support
This endpoint does not support streaming responses.
Response Format
{
  "request_id": "2025-03-25_point_2025-03-25-21:00:39-715d03",
  "points": [
    {
      "x": 0.65, // x coordinate (normalized 0-1)
      "y": 0.42  // y coordinate (normalized 0-1)
    },
    // Additional points...
  ]
}
Coordinate System
Coordinates are normalized to the image dimensions, ranging from 0 to 1:
- (0,0) is the top-left corner of the image
- (1,1) is the bottom-right corner of the image
To convert to pixel coordinates, multiply by the image dimensions:
- pixel_x = x * image_width
- pixel_y = y * image_height
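The conversion above can be sketched as a small helper. The function name `to_pixel` is hypothetical; rounding to the nearest integer and clamping to the last valid index are design choices made here, not behavior specified by the API.

```python
def to_pixel(point: dict, width: int, height: int) -> tuple[int, int]:
    """Convert a normalized point to integer pixel coordinates."""
    px = round(point["x"] * width)
    py = round(point["y"] * height)
    # Clamp so that a coordinate of exactly 1.0 still indexes a valid pixel.
    return min(max(px, 0), width - 1), min(max(py, 0), height - 1)


# The sample point from the response format, on a 1000x500 image.
print(to_pixel({"x": 0.65, "y": 0.42}, 1000, 500))  # (650, 210)
```

Clamping matters when the result is used to index into a pixel array, because a normalized value of 1.0 would otherwise map one past the last column or row.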
Examples
import moondream as md
from PIL import Image
import matplotlib.pyplot as plt

# Initialize with API key
model = md.vl(api_key="your-api-key")

# Load an image
image = Image.open("path/to/image.jpg")

# Locate objects (e.g., "person", "car", "face", etc.)
result = model.point(image, "face")
points = result["points"]
request_id = result["request_id"]
print(f"Found {len(points)} faces")
print(f"Request ID: {request_id}")

# Visualize the points
plt.figure(figsize=(10, 10))
plt.imshow(image)
for point in points:
    # Convert normalized coordinates to pixel values
    x = point["x"] * image.width
    y = point["y"] * image.height
    # Plot the point
    plt.plot(x, y, 'ro', markersize=15, alpha=0.7)
    plt.text(
        x + 10, y, "Face",
        color='white', fontsize=12,
        bbox=dict(facecolor='red', alpha=0.5)
    )
plt.axis('off')
plt.savefig("output_with_points.jpg")
plt.show()
Use Cases
- Interactive image experiences
- Accurate object counting
- Heat map generation
- Creating image annotations
- Precise focus point identification for photography applications
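As an illustration of the counting and heat map use cases, the sketch below bins normalized points into a coarse grid. The function name `heatmap_grid` and the grid resolution are hypothetical choices; the input is assumed to be the `points` list from a /point response.

```python
def heatmap_grid(points: list[dict], rows: int, cols: int) -> list[list[int]]:
    """Count points per cell of a rows x cols grid laid over the image."""
    grid = [[0] * cols for _ in range(rows)]
    for p in points:
        # min() keeps a coordinate of exactly 1.0 inside the last cell.
        col = min(int(p["x"] * cols), cols - 1)
        row = min(int(p["y"] * rows), rows - 1)
        grid[row][col] += 1
    return grid


points = [{"x": 0.1, "y": 0.1}, {"x": 0.15, "y": 0.2}, {"x": 0.9, "y": 0.95}]
grid = heatmap_grid(points, rows=2, cols=2)
total = sum(sum(row) for row in grid)  # object count equals len(points)
```

Because the coordinates are normalized, the same grid can be compared across images of different sizes without rescaling.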
Difference Between /point and /detect
Point vs. Detect
- /point returns center coordinates only (x, y)
- /detect returns bounding boxes (x_min, y_min, x_max, y_max)
Use /point when you need:
- Precise object center points
- Simpler data for plotting
- Multiple instance counting
Use /detect when you need:
- Object size information
- Visual highlighting of complete objects
- Cropping regions of interest
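To make the relationship between the two output shapes concrete, the sketch below reduces a bounding box to its midpoint. It assumes that /detect boxes use the same normalized 0-1 coordinates as /point, which this page does not confirm, and the midpoint is only an approximation of what /point might return for the same object. The function name `box_center` is hypothetical.

```python
def box_center(box: dict) -> dict:
    """Return the midpoint of a box given as x_min, y_min, x_max, y_max."""
    return {
        "x": (box["x_min"] + box["x_max"]) / 2,
        "y": (box["y_min"] + box["y_max"]) / 2,
    }


box = {"x_min": 0.25, "y_min": 0.5, "x_max": 0.75, "y_max": 1.0}
center = box_center(box)  # {"x": 0.5, "y": 0.75}
```

The reverse is not possible: a point carries no size information, which is why /detect is the better fit for cropping or highlighting.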
Common Object Types
Moondream can locate a wide range of objects. Here are some commonly used examples:
- person
- face
- car
- dog
- cat
- building
- furniture
- text
- food
- plant
Zero-Shot Detection
Like object detection, Moondream's pointing is zero-shot, meaning it can locate virtually any object you specify, not just those on a predefined list. For best results, describe the object as specifically as possible.
Error Handling
Common error responses:
Status Code | Description |
---|---|
400 | Bad Request - Invalid parameters or image format |
401 | Unauthorized - Invalid or missing API key |
413 | Payload Too Large - Image size exceeds limits |
429 | Too Many Requests - Rate limit exceeded |
500 | Internal Server Error - Server-side issue |
Error Response Format
Error responses are returned in the following format:
{
"error": {
"message": "Detailed error description",
"type": "error_type",
"param": "parameter_name",
"code": "error_code"
}
}
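A client might turn the documented error body into an exception along the following lines. This is a minimal sketch: the class and function names are hypothetical, and treating 429 and 500 as retryable is a design choice made here rather than guidance from the API.

```python
import json

RETRYABLE_STATUSES = {429, 500}  # assumed policy, not specified by the API


class PointAPIError(Exception):
    """Carries the fields of the documented error object."""

    def __init__(self, status, message, error_type=None, param=None, code=None):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message
        self.error_type = error_type
        self.param = param
        self.code = code
        self.retryable = status in RETRYABLE_STATUSES


def parse_error(status: int, body: str) -> PointAPIError:
    """Build an exception from a status code and a raw response body."""
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        parsed = {}
    err = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(err, dict):
        err = {}  # body did not follow the documented format
    return PointAPIError(
        status,
        err.get("message", "Unknown error"),
        err.get("type"),
        err.get("param"),
        err.get("code"),
    )
```

Falling back to a generic message keeps the caller from failing a second time when a proxy or gateway returns a non-JSON body.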
Limitations
- Maximum image size: 10MB
- Supported image formats: JPEG, PNG, GIF (first frame only)
- Detection works best on clearly visible objects
- Multiple small objects may be more challenging to locate
- Rate limits apply based on your plan
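The size and format limits can be checked locally before uploading, which avoids a round trip that would end in a 400 or 413. The sketch below inspects the file's leading bytes rather than its extension. The function name `check_image` is hypothetical, "10MB" is assumed to mean 10 MiB, and whether the limit applies to the raw file or to the Base64-encoded payload is not stated on this page.

```python
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumes "10MB" means 10 MiB of raw image data

# Standard file signatures for the three supported formats.
SIGNATURES = {
    b"\xff\xd8\xff": "image/jpeg",
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
}


def check_image(data: bytes) -> str:
    """Return the MIME type if the image passes local checks, else raise."""
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError(f"image is {len(data)} bytes; limit is {MAX_IMAGE_BYTES}")
    for signature, mime in SIGNATURES.items():
        if data.startswith(signature):
            return mime
    raise ValueError("unsupported image format; expected JPEG, PNG, or GIF")
```

The returned MIME type can be reused when building the data URI prefix for the `image_url` parameter.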