
Document Intelligence

Scope — End-to-end AI pipeline that ingests unstructured financial PDFs and images, classifies document types, extracts structured securities data, and produces review-ready outputs with full provenance tracking.


Executive Summary

Document Intelligence is the foundational AI capability of the Sentinel platform. It eliminates manual data entry from portfolio statements, CAS reports, AIF statements, KYC forms, and MF order forms. The Nexus backend orchestrates a 12-stage agentic pipeline that calls multiple LLM providers (AWS Bedrock, OpenAI, Kimi) through the LLM Invocation Orchestrator. Standard documents are extracted in under a minute, and the same pipeline handles complex multi-entity, multi-page statements.


The Problem

Wealth management firms process thousands of financial documents monthly:

  • Portfolio statements from 50+ AMCs with non-standard layouts
  • AIF (Alternative Investment Fund) statements with complex L1/L2/L3 holdings
  • KYC forms and MF order forms with handwritten fields
  • CAS (Consolidated Account Statements) spanning years of transactions

Manual extraction is error-prone, slow, and does not scale. Legacy OCR tools fail on Indian financial document layouts and lack domain-specific entity recognition.


Architecture

┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│  Sentinel UI    │────▶│ Studio Middleware│────▶│  Nexus Backend  │
│  (Upload Drop)  │     │  (Auth + Proxy)  │     │ (12-Stage Pipe) │
└─────────────────┘     └──────────────────┘     └─────────────────┘
                              │                           │
                              ▼                           ▼
                        ┌──────────┐            ┌──────────────────┐
                        │  Redis   │            │  Celery Worker   │
                        │  Queue   │            │ (Light/Heavy/XL) │
                        └──────────┘            └──────────────────┘
                                                           │
                              ┌────────────────────────────┘
                              ▼
                    ┌─────────────────────┐
                    │  LLM Orchestrator   │
                    │  (Bedrock / OpenAI) │
                    └─────────────────────┘

Pipeline Stages

Stage  Agent                    Purpose
1      UploadValidator          File type, size, corruption checks
2      DocumentLoader           PDF → image rendering, page splitting
3      ClassificationAgent      CAS, AIF, KYC, Order Form, General
4      LayoutExtractor          Visual region detection (tables, headers)
5      DocumentStructureAgent   Hierarchical section mapping
6      EntityDetectionAgent     Client, account, scheme isolation
7      ParallelExtractionAgent  Securities, NAVs, dates, amounts
8      CrossPageStitcher        Multi-page table reconciliation
9      ValidationAgent          Cross-field consistency, sum checks
10     MetadataExtractionAgent  Bloomberg-style rich metadata
11     SummarizerAgent          Human-readable extraction summary
12     StorageAgent             Persist to MongoDB + OpenSearch

OCR Branch — Handwritten forms route through KimiOCRAgent (Kimi k2.5 via Bedrock) for specialized handwriting recognition.
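
The sum check listed for stage 9 can be illustrated with a minimal sketch. It assumes each extracted holding row carries a market value and that the statement prints a total to compare against; the `Holding` type, the field names, and the tolerance are hypothetical and are not the ValidationAgent's actual interface.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Holding:
    # Hypothetical row shape; the real extraction schema is not shown here.
    scheme: str
    market_value: Decimal


def check_total(holdings, stated_total, tolerance=Decimal("0.01")):
    """Compare the sum of row values against the total printed on the statement.

    Returns (ok, difference), where difference = computed sum - stated total.
    """
    computed = sum((h.market_value for h in holdings), Decimal("0"))
    diff = computed - stated_total
    return abs(diff) <= tolerance, diff
```

Using `Decimal` rather than `float` avoids rounding drift when many monetary rows are summed; a failed check would be a natural trigger for routing the document to human review.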


Personas & Journeys

Wealth Manager / Relationship Manager

  1. Uploads a client’s CAS PDF via Sentinel drag-and-drop
  2. Watches real-time progress bar (stage-by-stage updates via polling)
  3. Reviews extracted holdings table with confidence scores
  4. Downloads Excel export for client presentation
  5. Flags uncertain rows for operations review

Operations Analyst

  1. Monitors pipeline dashboard for failed extractions
  2. Opens review queue for low-confidence fields (< 0.85)
  3. Corrects mismatched ISINs or schemes
  4. Submits review — pipeline re-runs validation stage
  5. Tracks token cost per job for budget reconciliation
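
The confidence gate in step 2 can be sketched as a simple filter. The 0.85 threshold comes from the journey above; the shape of the `fields` mapping and the function name are assumptions made for illustration, not the platform's actual review API.

```python
REVIEW_THRESHOLD = 0.85  # threshold stated in the review-queue step above


def fields_needing_review(fields, threshold=REVIEW_THRESHOLD):
    """Return the names of fields whose confidence falls below the threshold.

    `fields` is assumed to map field name -> {"value": ..., "confidence": float}.
    """
    return [name for name, f in fields.items() if f["confidence"] < threshold]
```

Keeping the threshold as a parameter means the same function works if a different cut-off is configured.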

Product Admin

  1. Configures schema templates for new document types
  2. Manages cache warming for frequent AMC layouts
  3. Monitors extraction accuracy trends per document class
  4. Adjusts confidence thresholds by tenant

Key Features

Feature                  Detail
Multi-Entity Extraction  Detects and isolates multiple clients/accounts in a single document
Schema-Driven Output     L1 (Transactions), L2 (Holdings), L3 (Underlying), L1+L2 (AIF SoA)
Excel Export             Custom-formatted export with preview mode before download
Human Review             Confidence-gated review workflow with per-field accept/reject/edit
Token Cost Tracking      Per-document INR cost breakdown by model and stage
Async Queues             Light (≤10 pp), Heavy (11-100 pp), XL (>100 pp) with Celery + Redis
Webhook Callbacks        Notifies Zen chatbot and external systems on completion
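
The page-count tiers for the async queues can be expressed as a small routing function. This is a sketch under assumptions: the queue names `light`, `heavy`, and `xl` and the function name are hypothetical; only the page boundaries come from the table above.

```python
def select_queue(page_count: int) -> str:
    """Map a document's page count to a worker queue tier."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    if page_count <= 10:
        return "light"   # Light: up to 10 pages
    if page_count <= 100:
        return "heavy"   # Heavy: 11 to 100 pages
    return "xl"          # XL: more than 100 pages
```

With Celery, the returned name could be passed as the `queue` argument of a task's `apply_async` call, so that each tier is served by its own pool of workers.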

API Surface

All endpoints are proxied through Studio Middleware at /api/v1/nexus/*.

Method  Endpoint                              Purpose
POST    /api/v1/upload/                       Sync upload with pre-classification
POST    /api/v2/pipeline/process              Async pipeline kickoff
GET     /api/v2/status/process/{id}/progress  Real-time stage progress
GET     /api/v1/export/{job_id}/excel         Excel download
PUT     /api/v1/review/{review_id}            Submit human review
GET     /api/v1/token-usage/{doc_id}          Cost and token audit
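
A client that follows an async job might poll the progress endpoint until the job reaches a terminal state. The sketch below keeps the HTTP call behind a caller-supplied `fetch` function so that it stays transport-agnostic. The response shape (a dict with a `status` key) and the terminal status names are assumptions, since the payload format is not documented here.

```python
import time

# Assumed terminal statuses; the real API may use different names.
TERMINAL_STATUSES = {"completed", "failed"}


def poll_progress(fetch, process_id, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Call fetch(process_id) repeatedly until a terminal status or timeout.

    `fetch` is a hypothetical callable that performs
    GET /api/v2/status/process/{id}/progress and returns the parsed JSON dict.
    """
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch(process_id)
        if payload.get("status") in TERMINAL_STATUSES:
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError(f"process {process_id} did not finish in {timeout}s")
        sleep(interval)
```

Injecting `fetch` and `sleep` also makes the loop easy to exercise in tests without a running backend.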

Security, Compliance & Operations

  • Data Residency — All document binaries stored in tenant-scoped S3 buckets (ap-south-1)
  • Encryption — AES-256 at rest (S3 SSE), TLS 1.3 in transit
  • PII Handling — Aadhaar, PAN, IFSC, UPI identifiers masked before LLM calls via Orchestrator
  • Retention — Configurable per tenant; default 7 years for financial records
  • Audit — Every extraction job logged with full provenance (user, timestamp, model versions, confidence scores)
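
Masking identifiers before text reaches an LLM can be sketched with regular expressions. The patterns below follow the commonly published formats for PAN, Aadhaar, and IFSC and a loose `handle@provider` shape for UPI IDs. They are illustrative assumptions, not the Orchestrator's actual rules, and the placeholder tokens are hypothetical.

```python
import re

# Illustrative patterns only; production masking would need stricter validation
# (for example checksum checks) and handling of OCR noise.
PII_PATTERNS = [
    ("PAN", re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")),          # 5 letters, 4 digits, 1 letter
    ("AADHAAR", re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}\b")),  # 12 digits, optionally grouped
    ("IFSC", re.compile(r"\b[A-Z]{4}0[A-Z0-9]{6}\b")),          # 4 letters, a zero, 6 alphanumerics
    ("UPI", re.compile(r"\b[\w.\-]{2,}@[A-Za-z]{2,}\b")),       # loose; also catches email prefixes
]


def mask_pii(text):
    """Replace each matched identifier with a bracketed placeholder token."""
    for label, pattern in PII_PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text
```

For example, with made-up values, `mask_pii("PAN ABCDE1234F, UPI someone@upi")` returns `"PAN [PAN], UPI [UPI]"`. Regex masking tends to over-match (the UPI pattern will also hit email addresses), which is usually the safer failure mode when the goal is to keep identifiers out of prompts.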