
Webhook Integration

Webhook integration allows you to receive real-time notifications when documents are analyzed in Recognito. This enables you to build automated workflows that respond to document processing events.

Overview

When webhook integration is enabled, Recognito will send HTTP POST requests to your specified endpoint after a document has been successfully analyzed. This allows your application to:

  • Receive immediate notifications when document analysis is complete
  • Automatically process analyzed document data
  • Trigger downstream workflows based on document results
  • Integrate Recognito data into your existing systems

Enabling Webhook Integration

To enable webhook integration for your project:

  1. Navigate to your project in the Recognito application
  2. Go to Settings > Integrations
  3. Click Create new Integration
  4. In the modal, select Webhook Integration
  5. Configure the following options:
    • Webhook URL: Enter your webhook endpoint URL
    • Status: Set to Enabled or Disabled
    • Trigger Event: Select when to trigger the webhook:
      • Document Analysed: Trigger immediately after document analysis is complete
      • Document Approved: Trigger only after the document has been manually approved
  6. Save the configuration

Your webhook endpoint must:

  • Be publicly accessible via HTTPS
  • Accept HTTP POST requests with multipart/form-data
  • Return a 2xx status code to acknowledge receipt

Webhook Request

When a document is analyzed (or approved, depending on your trigger setting), Recognito will send a POST request to your configured webhook endpoint.

Headers

| Header | Value | Description |
|---|---|---|
| Content-Type | multipart/form-data | The request body contains both files and form data |
| User-Agent | Recognito-Webhook/1.0 | Identifies the request as coming from Recognito |
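
As a rough illustration, the two documented headers can be sanity-checked before any further processing. The sketch below is framework-agnostic and takes any mapping of header names to values. The function name `check_webhook_headers` is hypothetical, and matching on the `Recognito-Webhook/` prefix rather than the exact version string is an assumption. A User-Agent value is easy to spoof, so treat this as a plausibility check, not as authentication.

```python
def check_webhook_headers(headers):
    """Return a list of problems found in the request headers (empty if none)."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    problems = []

    content_type = lowered.get("content-type", "")
    # The real header also carries a boundary parameter, so compare the media type only.
    if content_type.split(";")[0].strip().lower() != "multipart/form-data":
        problems.append("unexpected Content-Type: %r" % content_type)

    user_agent = lowered.get("user-agent", "")
    # Assumption: the version after the slash may change, so only the prefix is checked.
    if not user_agent.startswith("Recognito-Webhook/"):
        problems.append("unexpected User-Agent: %r" % user_agent)

    return problems
```

An empty list means both headers look as documented; anything else can be logged or used to reject the request.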

Request Body

The webhook request is sent as multipart/form-data containing:

  • Files: The original document file(s) uploaded by the user
  • Form values: A JSON string containing the extracted document data

Form Data Structure

The form data contains a JSON object with the following structure:

| Field | Type | Description |
|---|---|---|
| project_uuid | string | The UUID of the project |
| document | object | Document details and extracted data |
| document.uuid | string | The unique identifier for the document |
| document.data | object | Extracted document data |
| document.data.docType | string | Type of document (e.g., "invoice", "receipt") |
| document.data.mainFields | object | Key-value pairs of main extracted fields |
| document.data.metadataFields | object | Key-value pairs of metadata fields (optional) |
| document.data.tableFields | object | Table data with items array (optional) |
| document.data.taxFields | object | Tax-related data with items array (optional) |
| document.data.paymentFields | object | Payment-related data with items array (optional) |
| document.content | string | Raw text content extracted from the document |
| document.original_filename | string | Original filename of the uploaded document |
| document.review_status | string | Current review status (e.g., "pending", "approved") |
| document.uploader_email | string | Email address of the user who uploaded the document |

Extracted Field Structure

Main Fields and Metadata Fields

Each field in document.data.mainFields and document.data.metadataFields contains:

| Property | Type | Description |
|---|---|---|
| type | string | Field data type (e.g., "string", "date", "number") |
| content | string | The extracted content value |
| confidence | number | Confidence score (0-1) for the extraction |
| pageNumber | integer | Page number where the field was found |
| boundingRegions | array | Coordinates of the field location on the page |
| contentByType | string/object | Type-specific formatted content |
| userFriendlyName | string | Display name for the field |
| originalKey | string | Original field key identifier |
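
One way to consume these field objects is to collapse mainFields into a simple name-to-value mapping and set aside low-confidence extractions for manual review. This is a minimal sketch, not part of the Recognito API: the function name `summarize_fields` and the 0.8 threshold are assumptions chosen for illustration.

```python
def summarize_fields(fields, min_confidence=0.8):
    """Split extracted fields into accepted values and names needing review."""
    accepted = {}
    needs_review = []
    for name, field in (fields or {}).items():
        if field.get("confidence", 0.0) >= min_confidence:
            # contentByType holds the type-specific value; fall back to raw content.
            accepted[name] = field.get("contentByType", field.get("content"))
        else:
            needs_review.append(name)
    return accepted, needs_review


main_fields = {
    "InvoiceId": {"type": "string", "content": "INV-2025-001",
                  "confidence": 0.989, "contentByType": "INV-2025-001"},
    "VendorName": {"type": "string", "content": "Random Inc.",
                   "confidence": 0.698, "contentByType": "Random Inc."},
}
accepted, needs_review = summarize_fields(main_fields)
print(accepted)      # {'InvoiceId': 'INV-2025-001'}
print(needs_review)  # ['VendorName']
```

The same function can be applied to metadataFields, since both objects share the field structure described above.
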

Table Fields, Tax Fields, and Payment Fields

The document.data.tableFields, document.data.taxFields, and document.data.paymentFields objects each contain:

  • items: An array of objects, where each object maps a field name to a field object

Each field object has the same structure as a field in mainFields (type, content, confidence, pageNumber, boundingRegions, contentByType, userFriendlyName, originalKey).

Example structure:

json
{
  "tableFields": {
    "items": [
      {
        "FieldName": {
          "type": "string",
          "content": "value",
          "confidence": 0.95,
          "pageNumber": 1,
          "boundingRegions": [...],
          "contentByType": "value",
          "userFriendlyName": "Field Name",
          "originalKey": "FieldName"
        }
      }
    ]
  }
}
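
Given that shape, the items array can be reduced to a list of plain dictionaries for downstream use. The sketch below assumes the structure shown in the example above. The helper name `table_rows` and the field names `Description` and `Amount` are hypothetical and used only for illustration.

```python
def table_rows(section):
    """Turn a tableFields/taxFields/paymentFields object into a list of plain dicts."""
    rows = []
    # These sections are optional, so tolerate a missing section or missing items.
    for item in (section or {}).get("items", []):
        # Each item maps a field name to a field object; keep only the content.
        rows.append({name: field.get("content") for name, field in item.items()})
    return rows


table_fields = {
    "items": [
        {
            "Description": {"type": "string", "content": "Consulting", "confidence": 0.95},
            "Amount": {"type": "number", "content": "100.00", "confidence": 0.93},
        }
    ]
}
print(table_rows(table_fields))  # [{'Description': 'Consulting', 'Amount': '100.00'}]
```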

Example Request

Form Data (JSON)

json
{
  "project_uuid": "affa965a-896d-42f9-a59f-d3ab95993aa2",
  "document": {
    "uuid": "f5009665-4d32-4bbc-b70a-d12a69f0a1e5",
    "data": {
      "docType": "invoice",
      "mainFields": {
        "InvoiceId": {
          "type": "string",
          "content": "INV-2025-001",
          "confidence": 0.989,
          "pageNumber": 1,
          "originalKey": "InvoiceId",
          "contentByType": "INV-2025-001",
          "boundingRegions": [11.8927, 12.4356, 14.9534, 12.4204, 14.9534, 12.9379, 11.8927, 12.9379],
          "userFriendlyName": "InvoiceId"
        },
        "InvoiceDate": {
          "type": "date",
          "content": "2025-01-15",
          "confidence": 0.989,
          "pageNumber": 1,
          "originalKey": "InvoiceDate",
          "contentByType": "2025-01-15",
          "boundingRegions": [11.9079, 13.1967, 14.39, 13.1967, 14.39, 13.6686, 11.9079, 13.6533],
          "userFriendlyName": "InvoiceDate"
        },
        "DueDate": {
          "type": "date",
          "content": "2026-02-15",
          "confidence": 0.988,
          "pageNumber": 1,
          "originalKey": "DueDate",
          "contentByType": "2026-02-15",
          "boundingRegions": [11.8927, 13.973, 14.4204, 13.9578, 14.4052, 14.4144, 11.8927, 14.3992],
          "userFriendlyName": "DueDate"
        },
        "VendorName": {
          "type": "string",
          "content": "Random Inc.",
          "confidence": 0.698,
          "pageNumber": 1,
          "originalKey": "VendorName",
          "contentByType": "Random Inc.",
          "boundingRegions": [4.7815, 6.6821, 7.5116, 6.705, 7.5069, 7.2679, 4.7768, 7.245],
          "userFriendlyName": "VendorName"
        },
        "CustomerName": {
          "type": "string",
          "content": "Recognito Inc.",
          "confidence": 0.968,
          "pageNumber": 1,
          "originalKey": "CustomerName",
          "contentByType": "Recognito Inc.",
          "boundingRegions": [12.9013, 7.4581, 16.2817, 7.4797, 16.2781, 8.0431, 12.8977, 8.0215],
          "userFriendlyName": "CustomerName"
        },
        "VendorAddress": {
          "type": "address",
          "content": "123 Random Drive, Suite 100",
          "confidence": 0.918,
          "pageNumber": 1,
          "originalKey": "VendorAddress",
          "contentByType": {
            "road": "Random Drive",
            "unit": "Suite 100",
            "houseNumber": "123",
            "streetAddress": "123 Random Drive Suite 100"
          },
          "boundingRegions": [4.8119, 7.4431, 11.0856, 7.4431, 11.0856, 8.0063, 4.8119, 8.0063],
          "userFriendlyName": "VendorAddress"
        },
        "BillingAddress": {
          "type": "address",
          "content": "456 Recognito Avenue, New\nYork, NY 10001",
          "confidence": 0.888,
          "pageNumber": 1,
          "originalKey": "BillingAddress",
          "contentByType": {
            "city": "New\nYork",
            "road": "Recognito Avenue",
            "state": "NY",
            "postalCode": "10001",
            "houseNumber": "456",
            "streetAddress": "456 Recognito Avenue"
          },
          "boundingRegions": [12.8977, 8.189, 19.1105, 8.189, 19.1105, 9.1936, 12.8977, 9.1936],
          "userFriendlyName": "BillingAddress"
        }
      }
    },
    "content": "INVOICE\nrandom Inc.\nBill To:\n123 random Drive, Suite 100\nRecognito Inc.\n(555) 123-4567\n456 Recognito Avenue, New\nbilling@random.com\nYork, NY 10001\n(555) 987-6543\nap@recognito.io\nInvoice #:\nINV-2025-001\nDate:\n2025-01-15\nDue Date:\n2026-02-15\nPO-RECOGNITO-2025-\nPO #:\n001\n2025-01-01 to\nService Period:\n2025-01-31",
    "original_filename": "random-demo-invoice",
    "review_status": "pending",
    "uploader_email": "user@recognito.io"
  }
}

Files

The original uploaded document file(s) are included in the multipart/form-data request.

Response

Your webhook endpoint should respond with a 2xx status code (e.g., 200 OK) to acknowledge successful receipt of the webhook.

json
{
  "status": "received"
}

If your endpoint returns an error status code (4xx or 5xx), Recognito will retry the webhook delivery according to the retry policy.

Retry Policy

If webhook delivery fails, Recognito will automatically retry:

  • First retry: After 1 minute
  • Second retry: After 5 minutes
  • Third retry: After 10 minutes

After all retries are exhausted, the webhook delivery is marked as failed and will not be retried automatically.
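
Because failed deliveries are retried, your endpoint may plausibly receive the same document more than once, for example if it finished processing but responded too late or with an error. A minimal sketch of duplicate detection follows, assuming each document should be processed at most once and that document.uuid is a suitable key. The `DeliveryLog` class and `handle_delivery` function are hypothetical names. The in-memory set is for illustration only; a real deployment would likely use a database or cache that survives restarts.

```python
import threading


class DeliveryLog:
    """Remembers which document UUIDs have already been processed."""

    def __init__(self):
        self._seen = set()
        self._lock = threading.Lock()

    def first_time(self, document_uuid):
        """Return True only the first time a given UUID is recorded."""
        with self._lock:
            if document_uuid in self._seen:
                return False
            self._seen.add(document_uuid)
            return True


log = DeliveryLog()


def handle_delivery(payload):
    """Process a payload once; acknowledge duplicates without reprocessing."""
    if not log.first_time(payload["document"]["uuid"]):
        return "duplicate"
    # Business logic would go here.
    return "processed"
```

In both cases the endpoint would still return a 2xx response, so that a duplicate delivery does not trigger further retries.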

Security

Validating Webhook Requests

To ensure webhook requests are genuinely from Recognito, you should:

  1. Validate that requests originate from the Recognito domain
  2. Use HTTPS endpoints to ensure encrypted communication

Testing Your Webhook

Before enabling webhook integration in production:

  1. Set up a test endpoint using tools like webhook.site or requestbin.com
  2. Enable webhook integration with the test URL
  3. Upload and analyze a test document
  4. Verify that your endpoint receives the webhook request
  5. Check the request structure and data format
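
To exercise your own endpoint locally before pointing Recognito at it, you can also send yourself a request that resembles the webhook. The sketch below builds such a request with the Python standard library only. The form field names `data` and `files` are assumptions taken from the example implementations in this document, not confirmed field names, and `build_test_request` is a hypothetical helper.

```python
import json
import urllib.request
import uuid


def build_test_request(url, payload, filename, file_bytes):
    """Build a multipart/form-data POST that resembles a Recognito webhook."""
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    file_header = 'Content-Disposition: form-data; name="files"; filename="%s"' % filename
    body = b"\r\n".join([
        # Part 1: the JSON payload as a form field (field name is an assumption).
        delimiter,
        b'Content-Disposition: form-data; name="data"',
        b"",
        json.dumps(payload).encode("utf-8"),
        # Part 2: the document file (field name is an assumption).
        delimiter,
        file_header.encode("utf-8"),
        b"Content-Type: application/octet-stream",
        b"",
        file_bytes,
        # Closing delimiter, followed by a final CRLF.
        delimiter + b"--",
        b"",
    ])
    headers = {
        "Content-Type": "multipart/form-data; boundary=" + boundary,
        "User-Agent": "Recognito-Webhook/1.0",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Passing the returned object to `urllib.request.urlopen` sends it, for example to `http://localhost:3000/webhook` if you run one of the example servers in this document.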

Example Implementation

Node.js/Express

javascript
const express = require('express');
const multer = require('multer');
const app = express();

// Configure multer for handling multipart/form-data
const upload = multer({ dest: 'uploads/' });

app.post('/webhook', upload.any(), (req, res) => {
  try {
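    // Parse the form data (JSON string)
    const formData = JSON.parse(req.body.data || '{}');

    // Reject payloads without a document rather than failing on property access below
    if (!formData.document) {
      return res.status(400).json({ error: 'Missing document data' });
    }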
    
    // Access the uploaded files
    const files = req.files;
    
    console.log('Project UUID:', formData.project_uuid);
    console.log('Document UUID:', formData.document.uuid);
    console.log('Document Type:', formData.document.data.docType);
    console.log('Extracted Fields:', formData.document.data.mainFields);
    console.log('Uploaded files:', files);
    
    // Process the document data
    // Your business logic here
    
    // Respond with success
    res.status(200).json({ status: 'received' });
  } catch (error) {
    console.error('Error processing webhook:', error);
    res.status(500).json({ error: 'Failed to process webhook' });
  }
});

app.listen(3000, () => {
  console.log('Webhook server listening on port 3000');
});

Python/Flask

python
import json
import os

from flask import Flask, request, jsonify
from werkzeug.utils import secure_filename

app = Flask(__name__)

@app.route('/webhook', methods=['POST'])
def webhook():
    try:
        # Parse the form data (JSON string)
        form_data = json.loads(request.form.get('data', '{}'))

        # Access the uploaded files
        files = request.files.getlist('files')

        print('Project UUID:', form_data.get('project_uuid'))
        print('Document UUID:', form_data['document']['uuid'])
        print('Document Type:', form_data['document']['data']['docType'])
        print('Extracted Fields:', form_data['document']['data']['mainFields'])
        print('Uploaded files:', files)

        # Process the document data
        # Your business logic here

        # Save files if needed; sanitize the client-supplied filename first
        os.makedirs('uploads', exist_ok=True)
        for file in files:
            filename = secure_filename(file.filename or '') or 'document'
            file.save(os.path.join('uploads', filename))
        
        # Respond with success
        return jsonify({'status': 'received'}), 200
    except Exception as e:
        print('Error processing webhook:', str(e))
        return jsonify({'error': 'Failed to process webhook'}), 500

if __name__ == '__main__':
    app.run(port=3000)

Troubleshooting

Webhook Not Being Received

  • Verify your endpoint URL is correct and publicly accessible
  • Check that your server is accepting POST requests
  • Ensure your firewall allows incoming connections
  • Verify that your HTTPS certificate is valid

Webhook Timing Out

  • Ensure your endpoint responds within 30 seconds
  • Process webhook data asynchronously if needed
  • Return 200 OK immediately, then process in background
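
The "acknowledge first, process later" pattern can be sketched with a queue and a worker thread. This is an illustrative outline under simple assumptions: `accept_webhook`, `worker`, and the `results` list are hypothetical names, and the in-memory queue loses pending work if the process restarts, so a persistent job queue may be preferable in production.

```python
import queue
import threading

jobs = queue.Queue()
results = []


def worker():
    """Process queued payloads one at a time, outside the request cycle."""
    while True:
        payload = jobs.get()
        try:
            if payload is None:  # sentinel value: stop the worker
                break
            # Slow business logic would go here.
            results.append(payload["document"]["uuid"])
        finally:
            jobs.task_done()


def accept_webhook(payload):
    """Enqueue the payload and return the acknowledgement right away."""
    jobs.put(payload)
    return {"status": "received"}, 200


# Daemon thread so the worker does not block interpreter shutdown.
threading.Thread(target=worker, daemon=True).start()
```

A request handler would parse the incoming form data, call `accept_webhook`, and send the returned body and status code back without waiting for the worker.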

Webhook Data Issues

  • Verify you're parsing the request as multipart/form-data first, then decoding the JSON string from the form data
  • Log the raw request body for debugging