Overview

RecordEngine supports five AI providers. All are configurable from Settings → AI Backend, with no code changes or server restarts.
| Provider | Type | Best For |
| --- | --- | --- |
| Local (Ollama) | On-premise | Maximum privacy, air-gapped deployments |
| OpenAI | Cloud BYOK | High accuracy, proven reliability |
| Claude (Anthropic) | Cloud BYOK | Multilingual, nuanced reasoning |
| Qwen API | Cloud BYOK | Chinese documents |
| Gemini (Google) | Cloud BYOK | Best cost/performance ratio |

Switching Providers

  1. Go to Settings → AI Backend
  2. Select a provider from the dropdown
  3. Enter your API key (cloud providers only)
  4. Click 💾 Save
  5. Click 🧪 Test Connection to verify
API keys are stored locally in the RecordEngine database on your server. They are never transmitted anywhere except to the selected AI provider.

Provider Details

Local (Ollama)

Fully offline inference on your GPU. No API key required. No data leaves your network. Default models:
| Task | Model |
| --- | --- |
| Extraction / structured output | qwen3.5:9b |
| Vision / OCR | qwen3.5:9b |
| Chat | qwen3.5:9b |
Requirements: GPU instance with NVIDIA drivers and Ollama installed. The deploy script handles this automatically for GPU-mode deployments.
Processing modes:
  • Precise — uses qwen3.5:27b for text tasks (higher accuracy, slower)
  • Fast — uses qwen3.5:9b for all tasks (default, recommended)
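The mode-to-model mapping above can be sketched as a small lookup. This is a minimal illustration, not RecordEngine's actual code: the function name, the task labels, and the assumption that vision stays on the 9b model in Precise mode (the text only says Precise applies to "text tasks") are all hypothetical.

```python
# Model tags come from the text above; everything else is a hypothetical sketch.
FAST_MODEL = "qwen3.5:9b"
PRECISE_TEXT_MODEL = "qwen3.5:27b"

# Assumption: extraction and chat count as "text tasks"; vision does not.
TEXT_TASKS = {"extraction", "chat"}


def pick_local_model(task: str, mode: str = "fast") -> str:
    """Return the Ollama model tag for a task under a processing mode."""
    if mode not in ("fast", "precise"):
        raise ValueError(f"unknown processing mode: {mode!r}")
    if mode == "precise" and task in TEXT_TASKS:
        return PRECISE_TEXT_MODEL
    # Fast mode, or a non-text task in Precise mode, uses the default model.
    return FAST_MODEL
```

For example, `pick_local_model("extraction", "precise")` selects the 27b model, while `pick_local_model("vision", "precise")` still selects the 9b model under the stated assumption.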

OpenAI

Default model: gpt-4o
API key: Get yours at platform.openai.com
Pricing: ~$2.50 / 1M input tokens, $10.00 / 1M output tokens
GPT-4o handles vision and structured extraction with high accuracy. This is the recommended provider for GPU-mode pilots.

Claude (Anthropic)

Default model: claude-sonnet-4-5
API key: Get yours at console.anthropic.com
Pricing: ~$3.00 / 1M input tokens, $15.00 / 1M output tokens
Strong at multilingual documents including Chinese text and nuanced reasoning for AI chat.

Qwen API

Default models: qwen-max (text), qwen-vl-max (vision)
API key: Get yours at dashscope.aliyun.com
Pricing: ~$0.50 / 1M input tokens
Best performance on Chinese-language documents. Requests are processed on Alibaba’s DashScope infrastructure. Flag this for security-conscious prospects, even though data stays within Alibaba Cloud.

Gemini (Google)

Default model: gemini-2.5-flash-preview-04-17
API key: Get yours at aistudio.google.com
Pricing: ~$0.30 / 1M input tokens, $2.50 / 1M output tokens
Recommended for API-mode (BYOK) deployments where cost efficiency matters. Gemini delivers extraction quality comparable to GPT-4o at approximately 5x lower cost. Excellent OCR performance on handwritten and complex document layouts. To use Gemini 3 Flash, enter gemini-3-flash-preview in the model override field.
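To make the cost comparison concrete, the approximate per-token prices listed in this section can be plugged into a small estimator. This is a rough sketch: the provider identifiers and the function name are illustrative, the prices are the approximate figures quoted above, and the Qwen API is omitted because only its input price is given.

```python
# Approximate USD per 1M tokens (input, output), taken from the pricing lines above.
PRICES = {
    "openai": (2.50, 10.00),
    "claude": (3.00, 15.00),
    "gemini": (0.30, 2.50),
}


def estimate_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a workload from its token counts."""
    in_price, out_price = PRICES[provider]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Example workload: 1M input tokens and 1M output tokens.
gpt = estimate_cost("openai", 1_000_000, 1_000_000)
gem = estimate_cost("gemini", 1_000_000, 1_000_000)
print(f"{gpt:.2f} {gem:.2f} {gpt / gem:.1f}x")  # prints "12.50 2.80 4.5x"
```

The ratio depends on the input/output mix: input-heavy workloads such as document extraction see a larger gap (about 8x on input tokens alone), while output-heavy workloads see a smaller one (4x on output tokens alone).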

Model Override

To use a specific model instead of the provider default:
  1. Go to Settings → AI Backend
  2. Enter the model name in the Model override field
  3. Save
Examples:
  • OpenAI: gpt-4o-mini (cheaper, lower accuracy)
  • Claude: claude-haiku-4-5 (faster, cheaper)
  • Gemini: gemini-3-flash-preview (latest generation)
Leave the field blank to use the provider default.
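The override rule amounts to "use the override if one is set, otherwise the provider default". A minimal sketch of that resolution follows; the function name and the provider identifiers (inferred from the setting-key suffixes in this document) are assumptions, while the default model names are the ones listed under Provider Details.

```python
from typing import Optional

# Default text models as listed in this document; provider IDs are assumed.
PROVIDER_DEFAULTS = {
    "openai": "gpt-4o",
    "claude": "claude-sonnet-4-5",
    "qwen_api": "qwen-max",
    "gemini": "gemini-2.5-flash-preview-04-17",
}


def resolve_model(provider: str, override: Optional[str] = None) -> str:
    """Return the override if one is set, else the provider's default model."""
    cleaned = (override or "").strip()
    # A blank or whitespace-only field means "use the provider default".
    return cleaned or PROVIDER_DEFAULTS[provider]
```

With this sketch, `resolve_model("openai", "gpt-4o-mini")` yields the override, and `resolve_model("openai", "")` falls back to `gpt-4o`.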

Fallback to Local

When a cloud provider is configured, you can enable automatic fallback to your local GPU if the cloud API is unavailable:
  1. Enable the "Fallback to Local GPU if cloud is unavailable" option in Settings → AI Backend
  2. Save
This requires a GPU instance with Ollama running, so it does not apply to API-mode deployments.
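The fallback behavior can be pictured as a wrapper that tries the cloud provider first and only then the local model. The sketch below is an illustration under assumptions: `CloudUnavailable`, `generate_with_fallback`, and the callable parameters are hypothetical names, and how RecordEngine actually detects an outage is not described here.

```python
from typing import Callable


class CloudUnavailable(Exception):
    """Hypothetical error standing in for a cloud API outage."""


def generate_with_fallback(
    prompt: str,
    cloud_call: Callable[[str], str],
    local_call: Callable[[str], str],
    fallback_enabled: bool,
) -> str:
    """Try the cloud provider; use the local model only if fallback is enabled."""
    try:
        return cloud_call(prompt)
    except CloudUnavailable:
        if not fallback_enabled:
            # Fallback is off: surface the outage to the caller.
            raise
        return local_call(prompt)
```

Keeping the setting as an explicit flag means an outage is never silently masked when fallback is disabled.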

Architecture

All AI calls are routed through app/ai_router.py. The three public functions are:
ai_router.ai_generate(task, prompt)        # structured extraction, summaries
ai_router.ai_vision(image_path, prompt)    # OCR, image analysis
ai_router.ai_chat(messages, system_prompt) # multi-turn document chat
Provider selection and API keys are read from the database on every call, so no restart is required when switching providers.
Retry logic: All cloud calls use 3-attempt retry with 2-second delay. Authentication errors (401/403) skip retry immediately.
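The retry policy described above (three attempts, a 2-second delay, no retry on 401/403) can be sketched as follows. This is not the code in app/ai_router.py: `ProviderHTTPError` and `call_with_retry` are hypothetical names, and the injectable `sleep` parameter exists only to make the sketch easy to test.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ProviderHTTPError(Exception):
    """Hypothetical error carrying the HTTP status of a failed cloud call."""

    def __init__(self, status: int):
        super().__init__(f"provider returned HTTP {status}")
        self.status = status


def call_with_retry(
    fn: Callable[[], T],
    attempts: int = 3,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn up to `attempts` times, pausing `delay` seconds between tries."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except ProviderHTTPError as err:
            # Authentication errors will not fix themselves: fail immediately.
            # Also give up once the final attempt has failed.
            if err.status in (401, 403) or attempt == attempts:
                raise
            sleep(delay)
    raise AssertionError("unreachable")
```

A caller would wrap each cloud request in a zero-argument callable, for example `call_with_retry(lambda: send_request(prompt))`, where `send_request` is a placeholder for the provider-specific call.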

Per-Provider API Key Storage

Each provider’s key is stored independently:
| Setting key | Provider |
| --- | --- |
| ai_api_key_openai | OpenAI |
| ai_api_key_claude | Claude |
| ai_api_key_qwen_api | Qwen API |
| ai_api_key_gemini | Gemini |
Switching providers never overwrites another provider’s key.
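Because each provider has its own setting key, saving one key leaves the others untouched. The sketch below illustrates that with a plain dictionary standing in for the settings table; the helper names and the `ai_provider` key for the active provider are hypothetical, and only the `ai_api_key_*` names come from the table above.

```python
from typing import Dict


def api_key_setting(provider: str) -> str:
    """Build the setting key for a provider, e.g. 'ai_api_key_openai'."""
    return f"ai_api_key_{provider}"


def save_key(settings: Dict[str, str], provider: str, key: str) -> None:
    """Store a key under its provider-specific setting name."""
    settings[api_key_setting(provider)] = key


def switch_provider(settings: Dict[str, str], provider: str) -> None:
    """Change the active provider without touching any stored keys."""
    settings["ai_provider"] = provider  # hypothetical setting name


# A dictionary stands in for the database settings table in this sketch.
settings: Dict[str, str] = {}
save_key(settings, "openai", "sk-example-openai")
save_key(settings, "gemini", "example-gemini-key")
switch_provider(settings, "gemini")
switch_provider(settings, "openai")
```

After both switches, the `ai_api_key_openai` and `ai_api_key_gemini` entries are still present with their original values.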