
Convert a Document to Markdown

POST /api/v8/partner/document-to-markdown

Veryfi's Document to Markdown endpoint converts a document to markdown and stores the result. The following file types are supported: .gif, .csv, .bmp, .webp, .xls, .htm, .ofd, .html, .xlsx, .avif, .jpg, .png, .zip, .heif, .jpeg, .pdf, .txt, .heic. The maximum file size is 20 MB and the minimum is 0.25 KB. The rate limit is 20 requests per second.
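The limits above can be checked on the client before a file is sent. The following is a minimal sketch, not part of the Veryfi API: the function name `check_upload` is hypothetical, and it assumes binary units (1 KB = 1024 bytes), which the description does not specify.

```python
from pathlib import Path

# Extensions listed in the endpoint description.
SUPPORTED_EXTENSIONS = {
    ".gif", ".csv", ".bmp", ".webp", ".xls", ".htm", ".ofd", ".html", ".xlsx",
    ".avif", ".jpg", ".png", ".zip", ".heif", ".jpeg", ".pdf", ".txt", ".heic",
}

# Assumption: binary units; the description only says "20mb" and "0.25kb".
MAX_BYTES = 20 * 1024 * 1024  # 20 MB
MIN_BYTES = 256               # 0.25 KB


def check_upload(file_name: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = Path(file_name).suffix.lower()  # compare case-insensitively
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes < MIN_BYTES:
        problems.append(f"file too small: {size_bytes} bytes")
    if size_bytes > MAX_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems
```

A caller might run `check_upload(path.name, path.stat().st_size)` and skip the request if the list is non-empty. The server remains the authority on what it accepts.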

Request

Body

    external_id (string | null)

    External reference ID for tracking.

    package_path (string | null)

    Possible values: non-empty

    A path to a file in an S3 bucket, e.g. 'some/receipt.jpg'.

    bucket (string | null)

    Possible values: non-empty

    An S3 bucket for 'package_path', e.g. 'documents'.

    file_data (string | null)

    Possible values: non-empty

    Used to upload a document as a base64-encoded string, either raw or in data URI scheme. This is the least efficient way to upload a document for processing; consider file_urls or uploading zip files instead.

    file_url (string | null)

    Possible values: non-empty

    A URL to a publicly accessible document to be sent to Veryfi for processing.

    file_urls string[]

    Possible values: non-empty

    An array of URLs to publicly accessible documents to be sent to Veryfi for processing.

    file_name (string | null)

    Possible values: non-empty

    An optional filename. Useful to determine file type.

    details (boolean | null)

    Whether to return bounding boxes along with the markdown.

    document_type (string | null)

    Default value: document

    Type of document being converted (e.g., 'receipt', 'invoice', 'contract').

    tags string[]

    Tags to attach to the document.

    max_pages_to_process (integer | null)

    Possible values: >= 1 and <= 50

    Default value: 50

    Limits processing to this number of pages.
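As an illustration of how the body fields fit together, here is a minimal sketch that assembles a POST request with Python's standard library. Several things are assumptions rather than documented behavior: the helper name `build_request` is hypothetical, the base URL and authentication headers are not described in this section and must be supplied by the caller, and the sketch supports only `file_data` and `file_url`, insisting on exactly one of them as a simplification.

```python
import base64
import json
import urllib.request

ENDPOINT = "/api/v8/partner/document-to-markdown"


def build_request(base_url, auth_headers, *, file_bytes=None, file_url=None,
                  file_name=None, details=False, max_pages_to_process=50,
                  external_id=None, tags=None):
    """Build (but do not send) a request for the document-to-markdown endpoint."""
    # Simplification of this sketch: exactly one document source.
    if (file_bytes is None) == (file_url is None):
        raise ValueError("provide exactly one of file_bytes or file_url")
    # Documented range for max_pages_to_process is 1 to 50.
    if not 1 <= max_pages_to_process <= 50:
        raise ValueError("max_pages_to_process must be between 1 and 50")

    body = {"details": details, "max_pages_to_process": max_pages_to_process}
    if file_url is not None:
        body["file_url"] = file_url
    else:
        # file_data carries the document as a raw base64 string.
        body["file_data"] = base64.b64encode(file_bytes).decode("ascii")
    if file_name is not None:
        body["file_name"] = file_name
    if external_id is not None:
        body["external_id"] = external_id
    if tags:
        body["tags"] = list(tags)

    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    headers.update(auth_headers)  # caller-supplied; not covered in this section
    return urllib.request.Request(
        base_url.rstrip("/") + ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` once real credentials and the correct host are in place. Building the request separately from sending it keeps the payload easy to inspect and test.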

Responses

Returns the document-to-markdown result with its database ID.

Schema
    id integer required

    The database ID of the markdown document

    markdown (string | null) required

    The markdown content of the converted document.

    pages object[]

    Page structures returned when details is true.

    document_type string required

    Type of document

    external_id (string | null)

    External reference ID

    status string required

    Processing status

    md_storage_path (string | null)

    S3 path to the markdown document

    pdf_url (string | null)

    S3 URL to the PDF file

    img_url (string | null)

    S3 URL to the image file

    created string required

    Creation timestamp

    updated string required

    Last update timestamp

    tags string[]

    Default value: ``

    Document tags
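To show how the response schema might be consumed, the sketch below maps a decoded JSON body onto a dataclass and checks the fields marked as required. The names `MarkdownResult` and `parse_result` are hypothetical, and treating a null `pages` or `tags` value as an empty list is a choice made by this sketch, not documented behavior.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

# Fields marked "required" in the response schema.
REQUIRED_FIELDS = ("id", "markdown", "document_type", "status", "created", "updated")


@dataclass
class MarkdownResult:
    id: int
    markdown: Optional[str]  # required, but may be null
    document_type: str
    status: str
    created: str
    updated: str
    external_id: Optional[str] = None
    md_storage_path: Optional[str] = None
    pdf_url: Optional[str] = None
    img_url: Optional[str] = None
    pages: list = field(default_factory=list)  # populated when details is true
    tags: list = field(default_factory=list)


def parse_result(payload: dict[str, Any]) -> MarkdownResult:
    """Validate required keys and build a MarkdownResult, ignoring unknown keys."""
    missing = [key for key in REQUIRED_FIELDS if key not in payload]
    if missing:
        raise KeyError(f"missing required fields: {missing}")
    return MarkdownResult(
        id=payload["id"],
        markdown=payload["markdown"],
        document_type=payload["document_type"],
        status=payload["status"],
        created=payload["created"],
        updated=payload["updated"],
        external_id=payload.get("external_id"),
        md_storage_path=payload.get("md_storage_path"),
        pdf_url=payload.get("pdf_url"),
        img_url=payload.get("img_url"),
        pages=payload.get("pages") or [],  # sketch's choice: null becomes []
        tags=payload.get("tags") or [],
    )
```

In practice the payload would come from `json.loads` on the response body. The timestamp fields are kept as strings here because their format is not stated in the schema.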
