How This Helps
Incrementally add new media (images or videos) to an already-indexed dataset without re-processing everything from scratch. New media is processed as an independent batch and becomes immediately visible. When ready, trigger a full reindex to re-cluster the entire dataset.
Use status_new for all status checks. The status field is being retired. See Retrieve Dataset Status.
Prerequisites
Before calling the Add Media API, ensure:
- Dataset status is READY or PARTIAL INDEX
- Dataset has an embedding config: the dataset must have been indexed at least once with an embedding model so new media uses the same model for consistency
- Authentication: you need a valid JWT token or session cookie (see Authentication)
You can verify your dataset status using the Retrieve Dataset Status endpoint before attempting to add media.
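The status prerequisite can be checked in code before submitting anything. The sketch below assumes the status response is a JSON object with `status_new` at the top level; the helper name `can_add_media` is hypothetical and not part of the API.

```python
# Statuses in which the Add Media API accepts requests (from the prerequisites above).
ADD_MEDIA_READY_STATUSES = {"READY", "PARTIAL INDEX"}


def can_add_media(status_payload: dict) -> bool:
    """Return True if the dataset is in a state that accepts new media.

    Only status_new is consulted; the legacy status field is being retired.
    Assumption: status_new sits at the top level of the parsed response.
    """
    return status_payload.get("status_new") in ADD_MEDIA_READY_STATUSES
```

A payload that carries only the legacy `status` field is treated as not ready, which keeps callers from depending on the retired field.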
API Endpoint
POST /api/v1/dataset/{dataset_id}/add_media
Media Sources
Exactly one media source must be provided per request:
| Source | Form Field | Description |
|---|---|---|
| S3 Folder (Recommended) | s3_uri | Path to an S3 bucket folder containing media files |
| S3 Manifest | s3_uri | Path to a .csv, .parquet, or .txt manifest file listing media |
| Direct Upload | files | One or more files uploaded via multipart form |
| Archive Upload | archive | A single .zip, .tar, or .tar.gz archive |
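Because the server rejects requests with zero or multiple media sources, it can help to enforce the rule client-side first. This is a minimal sketch; `pick_media_source` is a hypothetical helper, and only the form field names (`s3_uri`, `files`, `archive`) come from the table above.

```python
from typing import Optional, Sequence, Tuple, Union


def pick_media_source(
    s3_uri: Optional[str] = None,
    files: Optional[Sequence[str]] = None,
    archive: Optional[str] = None,
) -> Tuple[str, Union[str, Sequence[str]]]:
    """Return (form_field, value) for the single media source supplied."""
    candidates = {"s3_uri": s3_uri, "files": files, "archive": archive}
    # None and empty values (e.g. an empty file list) count as "not provided".
    provided = {name: value for name, value in candidates.items() if value}
    if len(provided) != 1:
        raise ValueError(
            "Exactly one media source must be provided, got "
            f"{len(provided)}: {sorted(provided) or 'none'}"
        )
    return next(iter(provided.items()))
```

Failing early this way avoids a round trip that would otherwise end in a 400 response.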
Optional Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| auto_reindex | boolean | false | Automatically run a full reindex after the partial update completes |
| assume_role | string | null | AWS IAM role ARN to assume for cross-account S3 access |
| batch_n_videos | integer | null | Override auto-calculated video count (for resource allocation) |
| batch_n_images | integer | null | Override auto-calculated image count (for resource allocation) |
| batch_n_objects | integer | null | Override auto-calculated object count (for resource allocation) |
| use_spot | boolean | null | Allow pod scheduling on spot instances (cloud only) |
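The optional parameters can be collected into form fields as sketched below. The function `build_optional_fields` is hypothetical. It assumes booleans are sent as lowercase `true`/`false` strings and that unset parameters are best omitted so the server defaults apply.

```python
from typing import Dict, Optional


def build_optional_fields(
    auto_reindex: bool = False,
    assume_role: Optional[str] = None,
    batch_n_videos: Optional[int] = None,
    batch_n_images: Optional[int] = None,
    batch_n_objects: Optional[int] = None,
    use_spot: Optional[bool] = None,
) -> Dict[str, str]:
    """Turn the optional parameters into string-valued form fields."""
    raw = {
        "auto_reindex": auto_reindex,
        "assume_role": assume_role,
        "batch_n_videos": batch_n_videos,
        "batch_n_images": batch_n_images,
        "batch_n_objects": batch_n_objects,
        "use_spot": use_spot,
    }
    fields: Dict[str, str] = {}
    for name, value in raw.items():
        if value is None:
            continue  # omit the field so the server-side default (null) applies
        if isinstance(value, bool):  # check bool first: bool is a subclass of int
            fields[name] = "true" if value else "false"  # assumed wire format
        else:
            fields[name] = str(value)
    return fields
```

For example, `build_optional_fields(auto_reindex=True, batch_n_videos=5)` yields only the two fields that were set.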
Recommended: Add Media from S3 Bulk Path
For production workflows, using an S3 folder is the recommended approach. Point s3_uri to a folder in your S3 bucket containing images and/or videos.
Step 1: Identify your dataset ID
Find your dataset ID from the Visual Layer UI (it’s in the browser URL when viewing a dataset: https://app.visual-layer.com/dataset/<dataset_id>/data), or list your datasets via the API:
Step 2: Verify dataset is ready
Confirm that the status response contains "status_new": "READY" or "status_new": "PARTIAL INDEX".
Step 3: Add media from S3
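A minimal sketch of this request using only the Python standard library follows. The endpoint path and the `s3_uri` form field come from this page. Treat the rest as assumptions to check against the Authentication page and your deployment: the base URL (taken from the UI host), the `Authorization: Bearer` scheme, and URL-encoded form bodies. The function name is hypothetical.

```python
import urllib.parse
import urllib.request

# Assumption: the API is served from the same host as the UI.
BASE_URL = "https://app.visual-layer.com"


def build_add_media_request(
    dataset_id: str, s3_uri: str, token: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) a POST that adds media from an S3 folder."""
    url = f"{base_url}/api/v1/dataset/{dataset_id}/add_media"
    body = urllib.parse.urlencode({"s3_uri": s3_uri}).encode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # Assumption: the JWT is passed as a bearer token.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
```

Passing the result to `urllib.request.urlopen` sends it. A successful submission returns HTTP 202 with an empty body, after which the dataset status should be polled.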
Working Example
The following example was tested against a live Visual Layer Cloud environment and demonstrates adding 5 TikTok videos from an S3 bucket to an existing dataset.
Add videos from S3
HTTP 202 Accepted (empty body)
Poll dataset status
After submitting, the dataset transitions to UPDATING / READ ONLY while processing. Once processing completes, the status becomes PARTIAL INDEX, indicating new media has been added but the dataset has not yet been re-clustered.
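Polling can be sketched as a loop around a caller-supplied `fetch_status` callable that returns the current `status_new` string. The callable, the 30-second interval, and the one-hour timeout are illustrative assumptions, not values the API prescribes.

```python
import time
from typing import Callable, Iterable


def wait_for_status(
    fetch_status: Callable[[], str],
    targets: Iterable[str] = ("PARTIAL INDEX", "READY"),
    interval_s: float = 30.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until status_new reaches one of `targets`, or raise TimeoutError."""
    wanted = set(targets)
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in wanted:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"dataset still in {status!r} after {timeout_s}s")
        sleep(interval_s)  # processing is asynchronous; wait before re-checking
```

If the dataset was already in PARTIAL INDEX before the request, the first poll may return before the transition to READ ONLY is visible, so a caller may want to wait for READ ONLY first.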
Add Media with Auto-Reindex
If you want the dataset to be fully re-clustered automatically after the new media is processed, set auto_reindex=true. The dataset returns to READY when complete.
Manual Reindex
If you added media without auto_reindex, the dataset enters PARTIAL INDEX status. You can trigger a manual reindex when ready with POST /api/v1/dataset/{dataset_id}/reindex:
HTTP 202 Accepted
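The reindex call can be sketched as below, with a client-side guard for the PARTIAL INDEX requirement. The endpoint path is from this page. The base URL, the bearer-token header, the empty request body, and the function name are assumptions.

```python
import urllib.request


def build_reindex_request(
    dataset_id: str,
    current_status: str,
    token: str,
    base_url: str = "https://app.visual-layer.com",  # assumption: same host as the UI
) -> urllib.request.Request:
    """Build (but do not send) a reindex request for a PARTIAL INDEX dataset."""
    if current_status != "PARTIAL INDEX":
        # The server answers 409 for any other status, so fail before sending.
        raise RuntimeError(
            f"reindex requires PARTIAL INDEX, dataset is {current_status!r}"
        )
    return urllib.request.Request(
        f"{base_url}/api/v1/dataset/{dataset_id}/reindex",
        data=b"",  # assumption: no request body is needed
        method="POST",
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    )
```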
The reindex endpoint only accepts datasets in PARTIAL INDEX status. If the dataset is still processing (READ ONLY), wait for it to finish before triggering reindex.
Alternative: Direct File Upload
Upload individual files directly via multipart form. Use the files field (not files[]).
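To show the field naming, here is a small standard-library multipart encoder that repeats the `files` field once per file. It is a sketch, not the official client: `encode_files_multipart` is a hypothetical helper, and it assumes simple filenames with no quotes or line breaks.

```python
import mimetypes
import uuid
from typing import Iterable, Tuple


def encode_files_multipart(files: Iterable[Tuple[str, bytes]]) -> Tuple[bytes, str]:
    """Encode (filename, content) pairs as multipart/form-data.

    Returns (body, content_type_header_value). Every part uses the field
    name "files", not "files[]".
    """
    boundary = uuid.uuid4().hex
    parts = []
    for filename, content in files:
        content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + content + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("ascii"))  # closing delimiter
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

The returned body and header value can be attached to a POST to the add media endpoint. HTTP libraries with built-in multipart support can do the same job, as long as the field name stays `files`.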
Alternative: Archive Upload
Upload a single archive file (.zip, .tar, or .tar.gz) in the archive form field.
Cross-Account S3 Access
If your S3 data is in a different AWS account, use the assume_role parameter to supply the ARN of the IAM role to assume.
Python Example
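The following sketch submits an S3 folder together with `assume_role` and returns the HTTP status code. The endpoint and field names are from this page. The base URL, the bearer-token header, URL-encoded form bodies, and the function name `add_media_cross_account` are assumptions. The `opener` parameter exists only so the call can be exercised without network access.

```python
import urllib.parse
import urllib.request
from typing import Callable


def add_media_cross_account(
    dataset_id: str,
    s3_uri: str,
    role_arn: str,
    token: str,
    base_url: str = "https://app.visual-layer.com",  # assumption: same host as the UI
    opener: Callable = urllib.request.urlopen,
) -> int:
    """Submit an add media request that reads S3 through an assumed IAM role."""
    form = {"s3_uri": s3_uri, "assume_role": role_arn}
    request = urllib.request.Request(
        f"{base_url}/api/v1/dataset/{dataset_id}/add_media",
        data=urllib.parse.urlencode(form).encode("ascii"),
        method="POST",
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    )
    # With the default opener, non-2xx responses raise urllib.error.HTTPError.
    with opener(request) as response:
        return response.status  # 202 means processing started
```

A caller would typically check for 202 and then poll the dataset status until it leaves READ ONLY.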
Response Codes
See Error Handling for the error response format and Python handling patterns.
Add Media (POST /api/v1/dataset/{dataset_id}/add_media)
| HTTP Code | Status | Meaning | Common Cause |
|---|---|---|---|
| 202 | Accepted | Processing started successfully | Request valid, pipeline triggered asynchronously |
| 400 | Bad Request | Invalid request parameters | Missing embedding config, invalid S3 URI, no media source provided, or multiple media sources in one request |
| 403 | Forbidden | Feature disabled or insufficient permissions | ADD_MEDIA_ENABLED is false, or user lacks write access to the dataset |
| 404 | Not Found | Dataset not found | Dataset does not exist, or the authenticated user does not have access to it |
| 409 | Conflict | Dataset state incompatible | Dataset status is not READY or PARTIAL INDEX, or another add media / reindex operation is already running |
| 500 | Internal Server Error | Server-side failure | File upload to S3 failed, or pipeline trigger failed |
Reindex (POST /api/v1/dataset/{dataset_id}/reindex)
| HTTP Code | Status | Meaning | Common Cause |
|---|---|---|---|
| 202 | Accepted | Reindex started successfully | Request valid, reindex pipeline triggered |
| 404 | Not Found | Dataset not found | Dataset does not exist or user lacks access |
| 409 | Conflict | Dataset state incompatible | Dataset status is not PARTIAL INDEX |
| 500 | Internal Server Error | Server-side failure | Pipeline trigger failed |
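One way to act on these codes in a client is a small dispatch function. The action labels below are hypothetical conventions for illustration. Treating 409 as "wait and retry" follows the guidance on this page, while escalating 500 rather than retrying automatically is an assumption.

```python
def next_step(status_code: int) -> str:
    """Map an add media / reindex response code to a suggested client action."""
    if status_code == 202:
        return "poll"  # accepted: the pipeline runs asynchronously, so poll status_new
    if status_code == 409:
        return "wait"  # incompatible state or another operation running: retry later
    if status_code in (400, 403, 404):
        return "fix_request"  # resending the same request will not help
    if status_code == 500:
        return "report"  # server-side failure: surface to an operator (assumption)
    return "unexpected"
```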
Error Response Format
Error responses return a JSON body with a detail field.
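A sketch of reading that field and mapping it to the guidance in this section's table follows. It assumes `detail` is a plain string and matches on stable fragments of each message, since some messages embed a dataset ID or URI. `explain_error` is a hypothetical helper.

```python
import json

# (message fragment, suggested fix) pairs taken from the table in this section.
GUIDANCE = [
    ("Dataset not found", "Check the dataset ID and your access permissions"),
    ("has no embedding_config", "The dataset must be fully indexed at least once before adding media"),
    ("Exactly one media source must be provided", "Provide exactly one media source per request"),
    ("Invalid S3 URI or no media files found", "Check the S3 path exists and contains supported media files"),
    ("is blocked while", "Wait for the current operation to finish before starting another"),
    ("Add media feature is not enabled", "Contact your administrator to enable the add media feature"),
]


def explain_error(body: bytes) -> str:
    """Return a suggested fix for an error body, or the raw detail if unknown."""
    detail = json.loads(body).get("detail", "")  # assumption: detail is a string
    for fragment, advice in GUIDANCE:
        if fragment in detail:
            return advice
    return detail or "Unknown error"
```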
| Error Message | HTTP Code | What to Do |
|---|---|---|
| "Dataset not found" | 404 | Check the dataset ID and your access permissions |
| "Dataset {id} has no embedding_config..." | 400 | The dataset must be fully indexed at least once before adding media |
| "Exactly one media source must be provided: files[], s3_uri, or archive" | 400 | Provide exactly one media source per request |
| "Invalid S3 URI or no media files found at {uri}" | 400 | Check the S3 path exists and contains supported media files |
| "Operation 'add_media' is blocked while MEDIA_ADDITION task is running" | 409 | Wait for the current operation to finish before starting another |
| "Add media feature is not enabled" | 403 | Contact your administrator to enable the add media feature |
Dataset Status Flow
After calling add media, the dataset goes through these status transitions:
| Phase | status_new |
|---|---|
| Before | READY |
| Processing | READ ONLY |
| Awaiting Reindex | PARTIAL INDEX |
| Reindexing | READ ONLY → INDEXING |
| Complete | READY |
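The table can be read as a small state machine, which is useful for sanity-checking a sequence of polled statuses. The transition map below is one interpretation of the table, not a published specification. In particular, allowing READ ONLY to move straight to INDEXING is an assumption covering the auto-reindex case.

```python
from typing import List, Optional, Tuple

# Allowed next statuses, inferred from the status flow table (an interpretation).
ALLOWED = {
    "READY": {"READ ONLY"},
    "READ ONLY": {"PARTIAL INDEX", "INDEXING"},
    "PARTIAL INDEX": {"READ ONLY"},
    "INDEXING": {"READY"},
}


def first_unexpected_transition(observed: List[str]) -> Optional[Tuple[str, str]]:
    """Return the first (previous, current) pair that the map does not allow."""
    for previous, current in zip(observed, observed[1:]):
        if current == previous:
            continue  # repeated polls of the same status are fine
        if current not in ALLOWED.get(previous, set()):
            return (previous, current)
    return None
```

A `None` result means every observed change matches the documented flow. Polling too slowly can skip intermediate statuses and produce false positives.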
On-Premises (Docker Compose)
For on-premises installations, use the pipeline service endpoint.
On-premises add media uses local file paths instead of S3 URIs. The path parameter must be an absolute path accessible from the pipeline container.
Related Resources
Saved Views API
Create monitored views that automatically evaluate new media as it arrives.
Notifications API
Retrieve alerts generated when new media matches saved view filters.
Monitoring and Alerts
Understand how adding media triggers monitoring evaluation and alert delivery.
Task Manager API
Track add media and reindex tasks programmatically.