POST /api/v1/e/{slug}

Run a custom extractor on a document file. Extractors are pre-configured pipelines that extract specific data from documents and return structured output (JSON, CSV, XML, or Markdown).

Request

Headers

Authorization (string, required)
Bearer token authentication. Example: Bearer YOUR_API_KEY

Path Parameters

slug (string, required)
The unique slug identifier of the extractor (e.g., invoice-extractor).

Body Parameters

file (file)
The document file to process. Supported formats: PDF, DOCX, PNG, JPG, JPEG, WEBP, GIF. Provide either file or url, not both.

url (string)
URL to download the document from. The file will be fetched and processed. Provide either file or url, not both.

callback_url (string)
Webhook URL that receives a POST request when extraction completes.
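
The either-or rule for file and url can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Doctly API: build_form_args is a hypothetical helper name, and sending callback_url as a multipart form field alongside file or url is an assumption based on the curl examples on this page.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the Doctly API): prints curl form
# arguments, one per line, enforcing "either file or url, not both".
# Usage: build_form_args FILE URL [CALLBACK_URL]  (pass "" for an unused slot)
build_form_args() {
  local file="$1" url="$2" callback="${3:-}"
  if [ -n "$file" ] && [ -n "$url" ]; then
    echo "error: provide either file or url, not both" >&2
    return 1
  fi
  if [ -z "$file" ] && [ -z "$url" ]; then
    echo "error: one of file or url is required" >&2
    return 1
  fi
  if [ -n "$file" ]; then
    printf '%s\n' "-F" "file=@$file"
  else
    printf '%s\n' "-F" "url=$url"
  fi
  # Assumption: callback_url is sent as a form field, like file and url.
  if [ -n "$callback" ]; then
    printf '%s\n' "-F" "callback_url=$callback"
  fi
}
```

In bash, the output can be read into an array with mapfile -t ARGS < <(build_form_args "invoice.pdf" "" "https://example.com/hooks/doctly") and then passed to curl as "${ARGS[@]}" after the Authorization header.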

Example Request

curl -X POST https://api.doctly.ai/api/v1/e/invoice-extractor \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@invoice.pdf"

From URL

curl -X POST https://api.doctly.ai/api/v1/e/invoice-extractor \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "url=https://example.com/invoice.pdf"

Response

Example Responses

{
  "id": "123e4567-e89b-12d3-a456-426614174000",
  "file_name": "invoice.pdf",
  "file_size": 524288,
  "page_count": 2,
  "status": "PENDING",
  "extractor_id": "987fcdeb-a654-3210-9876-543210987654",
  "extractor": {
    "id": "987fcdeb-a654-3210-9876-543210987654",
    "name": "Invoice Extractor",
    "slug": "invoice-extractor",
    "cost_type": "PER_PAGE",
    "cost_credits": 5
  },
  "created_at": "2024-03-21T13:45:00Z"
}

Webhooks

If callback_url is provided, you’ll receive a POST request when extraction completes:
{
  "document_id": "123e4567-e89b-12d3-a456-426614174000",
  "status": "COMPLETED",
  "file_name": "invoice.pdf",
  "extractor": {
    "id": "987fcdeb-a654-3210-9876-543210987654",
    "name": "Invoice Extractor",
    "slug": "invoice-extractor"
  }
}
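
A receiving service will usually inspect status and document_id before fetching the result. The following is a minimal sketch, assuming jq is installed and that the payload has the shape shown above. handle_webhook is a hypothetical function name, and how the payload reaches it (web server, queue, log file) is outside the scope of the sketch. Treating every status other than COMPLETED as a non-success is also an assumption, since COMPLETED is the only value shown in the example payload.

```shell
#!/usr/bin/env bash
# Hypothetical handler: reads a webhook JSON payload on stdin and prints
# the document ID when extraction completed. Requires jq.
# Return codes: 0 = completed, 1 = other status, 2 = unusable payload.
handle_webhook() {
  local payload status doc_id
  payload=$(cat)
  # "// empty" turns a missing or null field into empty output.
  status=$(printf '%s' "$payload" | jq -r '.status // empty') || return 2
  doc_id=$(printf '%s' "$payload" | jq -r '.document_id // empty') || return 2
  if [ -z "$status" ] || [ -z "$doc_id" ]; then
    echo "error: payload is missing status or document_id" >&2
    return 2
  fi
  if [ "$status" = "COMPLETED" ]; then
    printf '%s\n' "$doc_id"
    return 0
  fi
  echo "document $doc_id reported status $status" >&2
  return 1
}
```

The printed ID can then be passed to the Get Document endpoint to retrieve the output.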

Polling for Results

After running an extractor, poll Get Document until status is COMPLETED or FAILED:
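DOC_ID="123e4567-e89b-12d3-a456-426614174000"
while true; do
  # Fetch the current document record
  RESP=$(curl -s "https://api.doctly.ai/api/v1/documents/$DOC_ID" \
    -H "Authorization: Bearer YOUR_API_KEY")
  STATUS=$(printf '%s' "$RESP" | jq -r '.status')
  echo "Status: $STATUS"
  if [ "$STATUS" = "COMPLETED" ]; then
    # Print the download URL of the extractor output
    printf '%s' "$RESP" | jq -r '.output_file_url'
    break
  elif [ "$STATUS" = "FAILED" ]; then
    echo "Extraction failed" >&2
    break
  fi
  # Wait before polling again
  sleep 5
done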

Extractor Output Formats

Extractors can output data in different formats:
Format | Content-Type | Description
JSON | application/json | Structured data as a JSON object
CSV | text/csv | Tabular data as CSV
XML | application/xml | Structured data as XML
Markdown | text/markdown | Formatted text as Markdown
The output format is determined by the extractor configuration.
Each extractor is designed for specific document types. Using an invoice extractor on a resume may produce incomplete or incorrect results.
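
When saving extractor output locally, the Content-Type values listed above can be mapped to file extensions. The following is a minimal sketch: extension_for_content_type is a hypothetical helper, the extensions are conventional choices and not something the API specifies, and stripping parameters such as charset assumes the server may append them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps an output Content-Type to a file extension.
extension_for_content_type() {
  # Drop any parameters (e.g. "; charset=utf-8"), then normalize
  # whitespace and letter case before matching.
  local ctype="${1%%;*}"
  ctype=$(printf '%s' "$ctype" | tr -d '[:space:]' | tr '[:upper:]' '[:lower:]')
  case "$ctype" in
    application/json) echo "json" ;;
    text/csv)         echo "csv" ;;
    application/xml)  echo "xml" ;;
    text/markdown)    echo "md" ;;
    *)
      echo "error: unexpected content type: $1" >&2
      return 1
      ;;
  esac
}
```

For example, a download script could name its output file "output.$(extension_for_content_type "$CONTENT_TYPE")", where CONTENT_TYPE holds the header value returned with the output file.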