Docling for IBM watsonx
API Reference

Convert Batch

Convert multiple documents, including from cloud storage, using the Docling for IBM watsonx API

Convert more than one document in a single request. Sources can be web URLs (HTTP/HTTPS) or cloud storage (S3), and results are delivered either as temporary download URLs or written to a cloud storage destination you specify.

Use this endpoint whenever you have more than one document. For a single document, use Convert Source or Convert File.

Endpoint

POST /v1/convert/source/batch

Request Headers

| Header | Required | Description |
| --- | --- | --- |
| X-Api-Key | Yes | Your API key for authentication |
| Content-Type | Yes | Must be application/json |

Request Body

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| sources | array | Yes | One or more source objects to convert (at least one) |
| target | object | Yes | Where converted results are delivered (see Targets) |
| options | object | No | Conversion options (see Options) |
| callbacks | array | No | Progress callback endpoints (see Progress Callbacks) |
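
As an illustration of how the headers and body fit together, the following is a minimal sketch that prepares the request without sending it. The base URL, the helper name `build_batch_request`, and the document URLs are hypothetical placeholders, not part of this reference.

```python
import json
import urllib.request

# Hypothetical placeholder: this reference does not state the service base URL.
BASE_URL = "https://api.example.com"


def build_batch_request(api_key, sources, target, options=None, callbacks=None):
    """Prepare (but do not send) a POST to /v1/convert/source/batch."""
    if not sources:
        raise ValueError("sources must contain at least one entry")
    body = {"sources": sources, "target": target}
    # Optional fields are left out entirely when not provided.
    if options is not None:
        body["options"] = options
    if callbacks is not None:
        body["callbacks"] = callbacks
    return urllib.request.Request(
        BASE_URL + "/v1/convert/source/batch",
        data=json.dumps(body).encode("utf-8"),
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_batch_request(
    api_key="YOUR_API_KEY",
    sources=[
        {"kind": "http", "url": "https://example.com/a.pdf"},
        {"kind": "http", "url": "https://example.com/b.pdf"},
    ],
    target={"kind": "presigned_url"},
    options={"to_formats": ["md", "json"]},
)
```

Sending is left to the caller, for example with `urllib.request.urlopen(request)` or an HTTP client of your choice.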

Sources

Each entry in sources is one of the following kinds. A single cloud storage source can represent many documents.

HTTP source

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| kind | string | Yes | Must be "http" |
| url | string | Yes | The URL of the document to convert |
| headers | object | No | Additional request headers used to fetch the URL (for example, authorization) |

S3 source

Reads objects from an S3-compatible bucket. Every object under bucket/key_prefix is converted, up to max_num_elements.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| kind | string | Yes | Must be "s3" |
| endpoint | string | Yes | S3 service endpoint, without protocol (for example, s3.us-east-2.amazonaws.com) |
| access_key | string | Yes | S3 access key |
| secret_key | string | Yes | S3 secret key |
| bucket | string | Yes | Bucket name to read from |
| key_prefix | string | No | Prefix for the object keys to read. Defaults to empty (the whole bucket) |
| verify_ssl | boolean | No | Use SSL to connect to S3. Defaults to true |
| max_num_elements | integer | No | Maximum number of objects to read from this source. Defaults to no limit |

Targets

The target determines where converted results are delivered. The choice is constrained by your sources:

| Sources | Allowed target | Result |
| --- | --- | --- |
| All HTTP | presigned_url or s3 | Temporary download URLs, or files written to your bucket |
| Any S3 source | s3 (required) | Files written to your destination bucket |

If any source reads from cloud storage, you must provide an s3 target. A web-only batch can use either target.
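
The source and target constraint can be checked on the client before submitting, which avoids a round trip that would end in a validation error. This is a sketch only; the function names are hypothetical and the service remains the authority on what it accepts.

```python
def allowed_target_kinds(sources):
    """Return the target kinds permitted for a list of source objects."""
    if not sources:
        raise ValueError("sources must contain at least one entry")
    kinds = {source["kind"] for source in sources}
    unknown = kinds - {"http", "s3"}
    if unknown:
        raise ValueError(f"unsupported source kinds: {sorted(unknown)}")
    # Any cloud storage source forces an s3 target; web-only batches may use either.
    return {"s3"} if "s3" in kinds else {"presigned_url", "s3"}


def check_target(sources, target):
    """Raise ValueError if the target kind is not allowed for these sources."""
    allowed = allowed_target_kinds(sources)
    if target["kind"] not in allowed:
        raise ValueError(
            f"target kind {target['kind']!r} is not allowed; use one of {sorted(allowed)}"
        )


web_only = [{"kind": "http", "url": "https://example.com/a.pdf"}]
print(sorted(allowed_target_kinds(web_only)))  # ['presigned_url', 's3']
```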

Presigned URL target

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| kind | string | Yes | Must be "presigned_url" |

The result contains one entry per document, each with a download URL for every requested output format.

S3 target

Writes the converted outputs to an S3-compatible bucket you specify. This can be any bucket; it does not have to be a source bucket, and source objects are never modified.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| kind | string | Yes | Must be "s3" |
| endpoint | string | Yes | S3 service endpoint, without protocol |
| access_key | string | Yes | S3 access key |
| secret_key | string | Yes | S3 secret key |
| bucket | string | Yes | Destination bucket for converted outputs |
| key_prefix | string | No | Prefix for the written object keys. Defaults to empty |
| verify_ssl | boolean | No | Use SSL to connect to S3. Defaults to true |
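
For a bucket-to-bucket batch, the request body pairs an S3 source with an S3 target. The sketch below assembles such a body; the bucket names, prefixes, limit, and environment variable names are hypothetical examples, and only the field names come from the tables above.

```python
import json
import os

# The endpoint is given without a protocol prefix.
endpoint = "s3.us-east-2.amazonaws.com"

# Hypothetical environment variable names; placeholders are used when unset.
credentials = {
    "access_key": os.environ.get("S3_ACCESS_KEY", "<access-key>"),
    "secret_key": os.environ.get("S3_SECRET_KEY", "<secret-key>"),
}

payload = {
    "sources": [
        {
            "kind": "s3",
            "endpoint": endpoint,
            **credentials,
            "bucket": "my-input-bucket",    # hypothetical bucket name
            "key_prefix": "incoming/",      # convert objects under this prefix
            "max_num_elements": 500,        # cap on objects read from this source
        }
    ],
    "target": {
        "kind": "s3",
        "endpoint": endpoint,
        **credentials,
        "bucket": "my-output-bucket",       # hypothetical bucket name
        "key_prefix": "converted/",
    },
}

print(json.dumps(payload, indent=2))
```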

Options

The options object supports the following parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| to_formats | array | ["md"] | Output formats, any of: "md", "html", "json", "text", "doclang" |
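
A small client-side check can catch a mistyped format name before the request is sent. This is a sketch under the assumption that omitting to_formats lets the service apply its default; `build_options` is a hypothetical helper name.

```python
SUPPORTED_FORMATS = ("md", "html", "json", "text", "doclang")


def build_options(to_formats=None):
    """Build an options object, rejecting format names not listed above."""
    if to_formats is None:
        # Leave the field out so the documented default (["md"]) applies.
        return {}
    unsupported = [fmt for fmt in to_formats if fmt not in SUPPORTED_FORMATS]
    if unsupported:
        raise ValueError(f"unsupported output formats: {unsupported}")
    return {"to_formats": list(to_formats)}


print(build_options(["md", "json"]))  # {'to_formats': ['md', 'json']}
```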

Response

Success Response (200 OK)

Returns a task object that can be used to poll for completion:

{
  "task_id": "{TASK_ID}",
  "task_type": "convert",
  "task_status": "pending",
  "task_position": 1,
  "task_meta": null,
  "failure": null,
  "error_message": null
}

See Convert Source for the task field descriptions.

Retrieving Results

Poll /v1/status/poll/{task_id} until the status is success, then call /v1/result/{task_id}. The result shape depends on the target:

  • presigned_url target — one entry per document with download URLs, plus result counters. See Get Results.
  • s3 target — converted files are written to your destination bucket; the result is a summary of counts:
{
  "num_converted": 312,
  "num_succeeded": 310,
  "num_partially_succeeded": 1,
  "num_failed": 1,
  "processing_time": 842.5
}

| Field | Type | Description |
| --- | --- | --- |
| num_converted | integer | Number of documents processed |
| num_succeeded | integer | Number converted successfully |
| num_partially_succeeded | integer | Number converted with partial success |
| num_failed | integer | Number that failed conversion |
| processing_time | number | Total processing time in seconds |
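
The poll-then-fetch flow can be sketched as a loop with a timeout. The HTTP call is injected as `fetch_status`, a hypothetical callable that returns the parsed JSON from /v1/status/poll/{task_id}. The sketch assumes that response carries a task_status field like the task object above, and the "failure" status name is an assumption, since this page only names "pending" and "success".

```python
import time


def wait_for_success(fetch_status, task_id, interval=5.0, timeout=3600.0,
                     failed_statuses=("failure",),  # assumed status name
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until the task reports success, then return the last status payload."""
    deadline = clock() + timeout
    while True:
        task = fetch_status(task_id)
        status = task["task_status"]
        if status == "success":
            return task
        if status in failed_statuses:
            raise RuntimeError(
                f"task {task_id} ended as {status}: {task.get('error_message')}"
            )
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} still {status} after {timeout}s")
        sleep(interval)


# Usage with a stand-in for the real HTTP call:
fake_responses = iter([
    {"task_status": "pending"},
    {"task_status": "pending"},
    {"task_status": "success"},
])
final = wait_for_success(lambda _task_id: next(fake_responses), "demo-task",
                         sleep=lambda _seconds: None)
print(final["task_status"])  # success
```

Once the loop returns, call /v1/result/{task_id} to retrieve the result for your target type.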

Progress Callbacks

For long-running batches, you can have the service notify your own endpoint instead of polling. Each entry in callbacks describes a webhook:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | URL the service POSTs progress updates to |
| headers | object | No | Additional headers to include on callback requests |
| ca_cert | string | No | Custom CA certificate (PEM) for verifying the callback endpoint |

Each callback request has the shape:

{
  "task_id": "{TASK_ID}",
  "progress": {
    "kind": "update_processed",
    "num_processed": 120,
    "num_succeeded": 118,
    "num_partially_succeeded": 1,
    "num_failed": 1,
    "docs": [
      { "source": "incoming/report.pdf", "status": "success", "error": null }
    ]
  }
}

The progress object's kind is one of:

| Kind | Description |
| --- | --- |
| set_num_docs | Sent once the total number of documents is known (num_docs) |
| update_processed | Running totals plus the documents processed in this update (docs) |
| document_completed | Sent after each document, with per-document detail (pages, tables, timing) and overall progress |
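
A webhook receiver typically branches on the progress kind. The sketch below turns a parsed callback body into a one-line summary. It assumes num_docs sits directly inside the progress object, and it does not read any document_completed fields because their exact names are not listed on this page. `describe_progress` is a hypothetical helper name.

```python
def describe_progress(payload):
    """Summarize one parsed callback body as a single log line."""
    task_id = payload["task_id"]
    progress = payload["progress"]
    kind = progress["kind"]
    if kind == "set_num_docs":
        # Assumption: the total is exposed as progress["num_docs"].
        return f"task {task_id}: {progress['num_docs']} documents in batch"
    if kind == "update_processed":
        return (
            f"task {task_id}: {progress['num_processed']} processed, "
            f"{progress['num_succeeded']} succeeded, "
            f"{progress['num_partially_succeeded']} partial, "
            f"{progress['num_failed']} failed"
        )
    if kind == "document_completed":
        return f"task {task_id}: a document finished"
    return f"task {task_id}: unrecognized progress kind {kind!r}"


sample = {
    "task_id": "abc",
    "progress": {
        "kind": "update_processed",
        "num_processed": 120,
        "num_succeeded": 118,
        "num_partially_succeeded": 1,
        "num_failed": 1,
        "docs": [{"source": "incoming/report.pdf", "status": "success", "error": None}],
    },
}
print(describe_progress(sample))
# task abc: 120 processed, 118 succeeded, 1 partial, 1 failed
```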

Error Responses

See Error Handling for the full error model.

| Status | Description |
| --- | --- |
| 400 | Bad request: malformed request |
| 401 | Unauthorized: missing or invalid API key |
| 422 | Validation error: invalid request shape, or an S3 source paired with a non-S3 target |
| 429 | Too many requests: rate limit exceeded |
| 500 | Internal server error |
| 502 | Gateway or upstream error |
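
One common client-side policy, not prescribed by this reference, is to retry transient statuses with exponential backoff and to fail fast on the rest. Treating 429, 500, and 502 as retryable is an assumption of this sketch, as are the helper names and default limits.

```python
# Assumption: these statuses are transient and worth retrying; 400, 401, and 422
# indicate a problem with the request itself and are not retried.
RETRYABLE_STATUSES = {429, 500, 502}


def should_retry(status, attempt, max_attempts=5):
    """Decide whether to retry after `attempt` failed tries (0-based)."""
    return status in RETRYABLE_STATUSES and attempt + 1 < max_attempts


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential backoff in seconds: base * 2**attempt, limited to cap."""
    return min(cap, base * (2 ** attempt))


print(should_retry(429, attempt=0), backoff_delay(3))  # True 8.0
```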

Examples

Find examples of using the batch endpoint in Converting Multiple Documents and Batch Conversion.
