Document Processing
How to process a document by content type
Documents of several content types can be submitted for processing. Keep the following in mind before choosing one of the submission methods described below:
- JPEG/JPG, PNG, and PDF are some of the accepted file formats.
- You can submit a zip containing all the related files to process multiple files as a single document.
- Check out the PDF Splitter API to process a PDF with multiple documents.
- See file requirements to learn how accuracy depends on the image's quality and clarity.
Process a document using multipart/form-data upload
Submit a request to process a multipart/form-data file upload.
Parameters
file REQUIRED String
The file being sent for processing. See supported file types.
file_name String
The filename of the image being sent for processing.
Example: starbucks.jpg
Uploading a zip file as a multipart/form-data upload is the fastest way to upload files.
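To make the upload format concrete, the following is a minimal sketch of how a multipart/form-data body carrying the file and file_name parameters could be assembled using only the Python standard library. The helper name build_multipart_body is hypothetical and not part of any Veryfi SDK. In practice an HTTP library usually builds this body for you, so treat this as an illustration of the wire format rather than a recommended implementation.

```python
import mimetypes
import uuid
from typing import Dict, Optional, Tuple


def build_multipart_body(
    field_name: str,
    filename: str,
    content: bytes,
    extra_fields: Optional[Dict[str, str]] = None,
) -> Tuple[bytes, str]:
    """Return (body, content_type_header) for a single-file multipart upload.

    Hypothetical helper for illustration only.
    """
    boundary = uuid.uuid4().hex
    parts = []

    # Plain text form fields, such as file_name
    for name, value in (extra_fields or {}).items():
        parts += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            value.encode("utf-8"),
        ]

    # The file part itself, with a guessed media type
    media_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts += [
        f"--{boundary}".encode(),
        (
            f'Content-Disposition: form-data; name="{field_name}"; '
            f'filename="{filename}"'
        ).encode(),
        f"Content-Type: {media_type}".encode(),
        b"",
        content,
        f"--{boundary}--".encode(),
        b"",
    ]
    return b"\r\n".join(parts), f"multipart/form-data; boundary={boundary}"


# Example: placeholder bytes stand in for a real image file
body, content_type = build_multipart_body(
    "file", "starbucks.jpg", b"\xff\xd8fake", {"file_name": "starbucks.jpg"}
)
```

The returned content type string would be sent as the Content-Type header alongside the body, since the receiver needs the boundary to split the parts.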
Process a document using a Base64 encoded file
Submit a request to process a Base64 encoded document.
Parameters
file_data REQUIRED File
The Base64 encoded file being sent for processing.
file_name String
The filename of the image being sent for processing.
Example: starbucks.jpg
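As a rough illustration of the two parameters above, the sketch below encodes raw file bytes as Base64 and places them in a JSON request body. The function name build_base64_payload is hypothetical, and the placeholder bytes stand in for a real image.

```python
import base64
import json


def build_base64_payload(file_bytes: bytes, file_name: str) -> str:
    """Return a JSON request body carrying the file as a Base64 string.

    Hypothetical helper for illustration only.
    """
    payload = {
        # b64encode returns bytes, so decode to get a JSON-safe string
        "file_data": base64.b64encode(file_bytes).decode("ascii"),
        "file_name": file_name,
    }
    return json.dumps(payload)


# Example: placeholder bytes instead of reading starbucks.jpg from disk
request_body = build_base64_payload(b"example image bytes", "starbucks.jpg")
```

Base64 inflates the payload by roughly a third compared with the raw bytes, which is worth keeping in mind for large files.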
Process a document using a URL
Submit a request to process a file using a URL.
Parameters
file_url REQUIRED String
The publicly accessible URL to the file.
Example: https://cdn.example.com/receipt.jpg
file_urls Array of Strings
A list of publicly accessible URLs to multiple files.
Example: ["https://cdn.example.com/receipt1.jpg", "https://cdn.example.com/receipt2.jpg"]
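The sketch below shows one way the request body for URL-based processing might be built, choosing between a single URL and a list of URLs. The function name build_url_payload is hypothetical, and the name file_urls for the list-valued parameter is an assumption made for this example. The URL check is only a basic sanity test and cannot confirm that a URL is publicly reachable.

```python
import json
from typing import Sequence
from urllib.parse import urlparse


def build_url_payload(urls: Sequence[str]) -> str:
    """Return a JSON request body for one or more file URLs.

    Hypothetical helper; "file_urls" is an assumed parameter name.
    """
    if not urls:
        raise ValueError("at least one URL is required")

    # Reject anything that is not an absolute http(s) URL
    for url in urls:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not an absolute http(s) URL: {url!r}")

    if len(urls) == 1:
        payload = {"file_url": urls[0]}
    else:
        payload = {"file_urls": list(urls)}
    return json.dumps(payload)


single = build_url_payload(["https://cdn.example.com/receipt.jpg"])
multiple = build_url_payload(
    ["https://cdn.example.com/receipt1.jpg", "https://cdn.example.com/receipt2.jpg"]
)
```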
Process document(s) with a zip file
Uploading a zip file as a multipart/form-data upload is the fastest way to upload files.
How to upload a zip file
- Create a zip file containing your files for processing
- Configure Veryfi Client with your CLIENT_ID, CLIENT_SECRET, USERNAME, and API_KEY
- Submit the zip file using multipart/form-data upload
How to upload a zip file (Python)
import zipfile

from veryfi import Client

# Files to bundle into a single document
list_files = ['receipt1.jpg', 'receipt2.jpg']
zip_file_path = 'receipts.zip'

# Write the files into a compressed zip archive on disk
with zipfile.ZipFile(zip_file_path, 'w') as zip_file:
    for file in list_files:
        zip_file.write(file, compress_type=zipfile.ZIP_DEFLATED)

# Veryfi credentials
client_id = 'your_client_id'
client_secret = 'your_client_secret'
username = 'your_username'
api_key = 'your_api_key'

# Submit the zip archive for processing
veryfi_client = Client(client_id, client_secret, username, api_key)
response = veryfi_client.process_document(zip_file_path)
How to manually upload a Base64 encoded file
- Compress the files
- Create a Base64 encoded string from the data
- Configure your Client ID and Authorization headers
- Configure Document Processing request parameters
- Submit the POST request for processing
Processing in-memory compressed files (Python)
import base64
import io
import json
import zipfile

import requests

list_files = ['receipt1.jpg', 'receipt2.jpg']

# Build the zip archive in memory instead of writing it to disk
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "a", zipfile.ZIP_DEFLATED, False) as zip_file:
    for file_name in list_files:
        with open(file_name, "rb") as image_file:
            zip_file.writestr(file_name, image_file.read())

# Base64 encode the archive so it can travel inside a JSON body
encode_zip_string = base64.b64encode(zip_buffer.getvalue()).decode("utf-8")

# Veryfi credentials
client_id = 'your_client_id'
client_secret = 'your_client_secret'
username = 'your_username'
api_key = 'your_api_key'

headers = {
    "User-Agent": "Python Veryfi-Python/3.0.0",
    "Accept": "application/json",
    "Content-Type": "application/json",
    "Client-Id": client_id,
    "Authorization": f"apikey {username}:{api_key}"
}
api_url = "https://api.veryfi.com/api/v8/partner/documents"
request_arguments = {
    "file_data": encode_zip_string,
}

# Submit the POST request for processing
_session = requests.Session()
response = _session.request(
    "POST",
    url=api_url,
    headers=headers,
    data=json.dumps(request_arguments),
)
How to retrieve extracted data by response type
See Sync vs Async processing to learn how the async request parameter controls document processing. By default, the async parameter is false, so documents are processed synchronously and the extracted data is returned in the response.
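As a small illustration of the async request parameter, the sketch below builds a JSON request body with the flag set either way. The function name build_processing_payload is hypothetical, and sending the flag as a JSON boolean in the request body is an assumption. Because async is a reserved word in Python, it has to be written as a string key rather than passed as a keyword argument.

```python
import json


def build_processing_payload(file_url: str, run_async: bool = False) -> str:
    """Return a JSON request body with the async flag set.

    Hypothetical helper; the JSON boolean encoding is an assumption.
    """
    payload = {
        "file_url": file_url,
        # "async" is a Python keyword, so it can only appear as a dict key
        "async": run_async,
    }
    return json.dumps(payload)


sync_body = build_processing_payload("https://cdn.example.com/receipt.jpg")
async_body = build_processing_payload(
    "https://cdn.example.com/receipt.jpg", run_async=True
)
```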
Synchronous response
All processing requests are synchronous by default. A user who submits a POST request for document processing therefore receives the extracted data in the API response.
Asynchronous response
Asynchronous processing requests receive an immediate response. However, the data extraction runs in a background process. Once data extraction completes, Veryfi makes a request to your configured webhook URL. The webhook URL is configurable in the Keys section of Settings.
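To show what the receiving side of a webhook could look like, here is a minimal sketch of an HTTP endpoint built on the Python standard library. The names parse_webhook_body, WebhookHandler, and serve are hypothetical. The sketch assumes the webhook arrives as a POST request with a JSON object in the body, and it makes no assumption about which fields that object contains.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_webhook_body(raw_body: bytes) -> dict:
    """Decode a webhook request body, assumed to be a JSON object."""
    data = json.loads(raw_body.decode("utf-8"))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data


class WebhookHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that accepts webhook POST requests."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            notification = parse_webhook_body(self.rfile.read(length))
        except ValueError:
            # Covers malformed JSON, bad encoding, and non-object bodies
            self.send_response(400)
            self.end_headers()
            return
        # Hand the notification to application code here
        print("Received webhook notification:", notification)
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the endpoint until interrupted; call this explicitly to start."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling serve() starts the listener on the chosen port. A production endpoint would also need to be reachable at the webhook URL configured in Settings and should verify that incoming requests are genuine.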