The Core Idea
When a document is processed in RecordEngine, it exists in isolation, with a contact, a folder, extracted fields, and a status. But in most organisations, that document also belongs to something in another system: a deal in the CRM, a project in the ERP, a case in the helpdesk. CRM correlation solves this by attaching a reference object to the document that says: “This document is also associated with record X in system Y, and you can find it at URL Z.” That reference travels with the document everywhere. It’s stored in the database, returned by the API, and included in every outbound webhook payload.
The external_refs Field
Every document in RecordEngine has an external_refs field — a JSON object that can hold references to records in any number of external systems simultaneously.
The top-level keys (for example, salesforce and hubspot) are arbitrary identifiers that you choose. RecordEngine doesn’t validate or interpret them; it stores and returns them exactly as provided.
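To make the shape concrete, here is a minimal Python sketch of an external_refs object that links one document to two systems. The top-level keys and the record_id field follow the text on this page; the url field name and every sample value are assumptions for illustration, not confirmed RecordEngine field names.

```python
import json

# Hypothetical example values. The top-level keys ("salesforce", "hubspot")
# are arbitrary labels you choose; "url" is an assumed field name.
external_refs = {
    "salesforce": {
        "record_id": "SF-OPP-0001",
        "url": "https://crm.example/salesforce/SF-OPP-0001",
    },
    "hubspot": {
        "record_id": "HS-DEAL-0001",
        "url": "https://crm.example/hubspot/HS-DEAL-0001",
    },
}

# The object is plain JSON, so it survives a serialise/parse round trip unchanged.
encoded = json.dumps(external_refs)
decoded = json.loads(encoded)

# One document, several systems at once.
systems = sorted(decoded)
```

Because the keys are not interpreted, the same structure works for an ERP or helpdesk reference by adding another top-level key.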
Three Patterns
There are three common ways CRM correlation gets used in practice:
Pattern 1 — CRM Pushes Documents to RecordEngine
Your CRM or automation platform detects a trigger (deal stage change, new contact, incoming email) and uploads a document to RecordEngine via the API, passing the CRM record ID in external_refs at upload time.
Pattern 2 — RecordEngine Pushes Data to CRM
Documents arrive in RecordEngine directly (via upload, email, or hot folder). When they’re exported, the webhook payload, including any external_refs set manually, is received by an automation that updates the appropriate CRM record.
Pattern 3 — Bidirectional Sync
Both systems push and pull. RecordEngine processes incoming documents and pushes extracted data to the CRM. The CRM triggers document uploads and stores the RecordEngine doc_url deep-link for one-click navigation back to the document.
Setting external_refs at Upload
Pass external_refs as a JSON string in the upload API request.
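A minimal Python sketch of preparing that field, assuming the upload is a form-style request in which external_refs travels as one text field. The helper name build_upload_fields, the sample record ID, and the request details in the trailing comment (UPLOAD_URL, auth_headers, the file field name) are hypothetical; check the API reference for the real endpoint and parameters.

```python
import json


def build_upload_fields(external_refs: dict) -> dict:
    """Return the text form fields for an upload, with external_refs as a JSON string."""
    # The value is serialised to a JSON *string* rather than sent as a nested object.
    return {"external_refs": json.dumps(external_refs)}


# Hypothetical CRM record ID captured by the automation that triggers the upload.
fields = build_upload_fields({"salesforce": {"record_id": "SF-OPP-0001"}})

# Sending is left to your HTTP client. With the third-party requests library,
# and a hypothetical UPLOAD_URL and file field name, it might look like:
#   requests.post(UPLOAD_URL, headers=auth_headers,
#                 files={"file": open("contract.pdf", "rb")}, data=fields)
```

Serialising in one helper keeps the encoding step in a single place, which helps when several automations upload documents.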
Updating external_refs After Upload
If a document was uploaded without external_refs, you can add them later.
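The sketch below shows one way an automation might prepare such an update in Python. It only builds the request and does not send it. The HTTP method (PATCH), the bearer-token header, the body shape, and the example URL are all assumptions, not the documented RecordEngine API. Whether an update replaces or merges existing references is also not stated here, so the sketch merges on the client side and sends the full object.

```python
import json
import urllib.request
from typing import Optional


def merge_ref(existing: Optional[dict], system: str, ref: dict) -> dict:
    """Return a copy of the existing refs with one system's reference added or replaced."""
    merged = dict(existing or {})
    merged[system] = ref
    return merged


def build_update_request(endpoint_url: str, api_key: str, external_refs: dict) -> urllib.request.Request:
    """Build (but do not send) a request carrying the full external_refs object."""
    # Method, auth header and body shape are assumptions; check the API reference.
    body = json.dumps({"external_refs": external_refs}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# A document uploaded without references gets its first one.
refs = merge_ref(None, "hubspot", {"record_id": "HS-DEAL-0001"})
request = build_update_request(
    "https://recordengine.example/documents/123",  # hypothetical URL
    "YOUR_API_KEY",
    refs,
)
# urllib.request.urlopen(request) would send it; omitted so the sketch has no side effects.
```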
external_refs in the Webhook Payload
Every outbound webhook payload includes the full external_refs object. Your automation uses it to find the right record in the destination system. For example, an automation that syncs to Salesforce reads external_refs.salesforce.record_id to know which Salesforce record to update.
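As a rough illustration, a webhook receiver written in Python could look up the Salesforce record ID like this. The payload is a hypothetical one, trimmed to the two fields this page describes (external_refs and doc_url); real payloads carry more, and the helper name is an assumption.

```python
import json
from typing import Optional


def salesforce_record_id(payload: dict) -> Optional[str]:
    """Return external_refs.salesforce.record_id, or None if the document is not linked."""
    refs = payload.get("external_refs") or {}
    ref = refs.get("salesforce") or {}
    return ref.get("record_id")


# Hypothetical webhook body, as it might arrive over HTTP.
raw_body = json.dumps({
    "doc_url": "https://recordengine.example/documents/123",
    "external_refs": {"salesforce": {"record_id": "SF-OPP-0001"}},
})

payload = json.loads(raw_body)
record_id = salesforce_record_id(payload)

# Documents with no Salesforce reference yield None, so the automation can skip them.
unlinked = salesforce_record_id({"external_refs": {}})
```

Returning None for unlinked documents lets the automation decide whether to skip the update or fall back to a search in the CRM.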
The doc_url Deep-Link
Every webhook payload also includes doc_url — a direct URL that opens the document in the RecordEngine UI. This is intended to be stored on the CRM record so your team can jump from a deal or contact straight to the full document, extracted fields, and approval status.
In Salesforce, store it in a custom URL field on the Opportunity. In HubSpot, store it in a custom Deal property. Your team then has one-click navigation between the CRM and RecordEngine without searching.
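A small Python sketch of that step: given a webhook payload, work out which CRM record should receive the doc_url and in which field. The custom field names below are hypothetical placeholders; substitute the names of the URL field or Deal property you created in your own CRM. The resulting dictionaries are meant to be handed to whatever CRM client your automation platform provides.

```python
from typing import Dict

# Hypothetical custom field names; replace with the ones defined in your CRM.
DOC_URL_FIELD = {
    "salesforce": "RecordEngine_URL__c",
    "hubspot": "recordengine_url",
}


def crm_updates(payload: dict) -> Dict[str, dict]:
    """Map each linked, known system to the record ID and field update carrying doc_url."""
    updates = {}
    for system, ref in (payload.get("external_refs") or {}).items():
        field = DOC_URL_FIELD.get(system)
        record_id = (ref or {}).get("record_id")
        # Skip systems with no configured field or no record ID.
        if field and record_id:
            updates[system] = {
                "record_id": record_id,
                "fields": {field: payload["doc_url"]},
            }
    return updates


# Hypothetical payload linking one document to a Salesforce Opportunity.
updates = crm_updates({
    "doc_url": "https://recordengine.example/documents/123",
    "external_refs": {"salesforce": {"record_id": "SF-OPP-0001"}},
})
```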
See Deep-Links for how the URL pattern works.
Why Not a Native Integration?
RecordEngine deliberately uses a generic external_refs field rather than native per-system integrations. The reasons:
- Works with any system — any CRM, ERP, or custom application with a unique record ID can be linked, not just the ones with built-in connectors
- No credentials stored in RecordEngine — the CRM credentials live in your automation platform, not in RecordEngine, which simplifies security and compliance
- Logic lives in your automation — field mapping, deduplication, and error handling are configured in your automation platform where they’re visible and auditable
- Model-agnostic — when RecordEngine’s AI model changes or upgrades, the integration doesn’t break because it doesn’t depend on model-specific outputs