Document Intelligence

Scope — End-to-end AI pipeline that ingests unstructured financial PDFs and images, classifies document types, extracts structured securities data, and produces review-ready outputs with full provenance tracking.

Executive Summary

Document Intelligence is the foundational AI capability of the Sentinel platform. It eliminates manual data entry from portfolio statements, CAS reports, AIF statements, KYC forms, and MF order forms. The Nexus backend orchestrates a 12-stage agentic pipeline that calls multiple LLM providers (AWS Bedrock, OpenAI, Kimi) through the LLM Invocation Orchestrator. The pipeline delivers sub-minute extraction for standard documents and handles complex multi-entity, multi-page statements.

The Problem

Wealth management firms process thousands of financial documents monthly:

Portfolio statements from 50+ AMCs with non-standard layouts
AIF (Alternative Investment Fund) statements with complex L1/L2/L3 holdings
KYC forms and MF order forms with handwritten fields
CAS (Consolidated Account Statements) spanning years of transactions

Manual extraction is error-prone, slow, and does not scale. Legacy OCR tools fail on Indian financial document layouts and lack domain-specific entity recognition.

Architecture

┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│  Sentinel UI    │────▶│ Studio Middleware│────▶│  Nexus Backend  │
│  (Upload Drop)  │     │  (Auth + Proxy)  │     │ (12-Stage Pipe) │
└─────────────────┘     └──────────────────┘     └─────────────────┘
                              │                           │
                              ▼                           ▼
                        ┌──────────┐             ┌──────────────────┐
                        │  Redis   │             │  Celery Worker   │
                        │  Queue   │             │ (Light/Heavy/XL) │
                        └──────────┘             └──────────────────┘
                                                          │
                              ┌───────────────────────────┘
                              ▼
                    ┌─────────────────────┐
                    │  LLM Orchestrator   │
                    │  (Bedrock / OpenAI) │
                    └─────────────────────┘

Pipeline Stages

Stage	Agent	Purpose
1	`UploadValidator`	File type, size, corruption checks
2	`DocumentLoader`	PDF → image rendering, page splitting
3	`ClassificationAgent`	Document type assignment (CAS, AIF, KYC, Order Form, General)
4	`LayoutExtractor`	Visual region detection (tables, headers)
5	`DocumentStructureAgent`	Hierarchical section mapping
6	`EntityDetectionAgent`	Client, account, scheme isolation
7	`ParallelExtractionAgent`	Securities, NAVs, dates, amounts
8	`CrossPageStitcher`	Multi-page table reconciliation
9	`ValidationAgent`	Cross-field consistency, sum checks
10	`MetadataExtractionAgent`	Bloomberg-style rich metadata
11	`SummarizerAgent`	Human-readable extraction summary
12	`StorageAgent`	Persist to MongoDB + OpenSearch

OCR Branch — Handwritten forms route through KimiOCRAgent (Kimi k2.5 via Bedrock) for specialized handwriting recognition.
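
The stage ordering in the table above can be sketched as a simple sequential runner. This is a minimal illustration, not the Nexus implementation: the stage names come from the table, while `JobContext`, `run_pipeline`, and the placeholder handlers are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Stage names are taken from the Pipeline Stages table; the rest is hypothetical.
STAGES: List[str] = [
    "UploadValidator",
    "DocumentLoader",
    "ClassificationAgent",
    "LayoutExtractor",
    "DocumentStructureAgent",
    "EntityDetectionAgent",
    "ParallelExtractionAgent",
    "CrossPageStitcher",
    "ValidationAgent",
    "MetadataExtractionAgent",
    "SummarizerAgent",
    "StorageAgent",
]


@dataclass
class JobContext:
    """Hypothetical container passed from stage to stage."""
    document_id: str
    data: Dict[str, object] = field(default_factory=dict)
    completed: List[str] = field(default_factory=list)


StageFn = Callable[[JobContext], None]


def run_pipeline(ctx: JobContext, handlers: Dict[str, StageFn]) -> JobContext:
    """Run every stage in table order and record each completed stage."""
    for name in STAGES:
        handler = handlers.get(name)
        if handler is None:
            raise KeyError(f"no handler registered for stage {name}")
        handler(ctx)
        ctx.completed.append(name)
    return ctx


def noop(ctx: JobContext) -> None:
    """Placeholder stage body used only for this demonstration."""


def classify(ctx: JobContext) -> None:
    """Stand-in classifier; the document type is hard-coded for the demo."""
    ctx.data["document_type"] = "CAS"


handlers: Dict[str, StageFn] = {name: noop for name in STAGES}
handlers["ClassificationAgent"] = classify

result = run_pipeline(JobContext(document_id="doc-001"), handlers)
```

Recording completed stage names on the context is one plausible way to feed the stage-by-stage progress view and the provenance log; the real pipeline may track this differently.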

Personas & Journeys

Wealth Manager / Relationship Manager

Uploads a client’s CAS PDF via Sentinel drag-and-drop
Watches real-time progress bar (stage-by-stage updates via polling)
Reviews extracted holdings table with confidence scores
Downloads Excel export for client presentation
Flags uncertain rows for operations review

Operations Analyst

Monitors pipeline dashboard for failed extractions
Opens review queue for low-confidence fields (< 0.85)
Corrects mismatched ISINs or schemes
Submits review — pipeline re-runs validation stage
Tracks token cost per job for budget reconciliation
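
For budget reconciliation, per-job token usage has to be rolled up into a cost figure. The sketch below shows one way to aggregate usage records into an INR breakdown by model and stage. The record shape, the model names, and the prices are all hypothetical placeholders, not actual Sentinel data or provider rates.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class UsageRecord:
    """Hypothetical per-call usage record."""
    model: str
    stage: str
    input_tokens: int
    output_tokens: int


# Hypothetical INR prices per 1,000 tokens as (input, output). Not real rates.
PRICE_PER_1K_INR: Dict[str, Tuple[Decimal, Decimal]] = {
    "model-a": (Decimal("0.25"), Decimal("1.00")),
    "model-b": (Decimal("0.05"), Decimal("0.20")),
}


def cost_breakdown(records: Iterable[UsageRecord]) -> Dict[Tuple[str, str], Decimal]:
    """Sum the INR cost of each record, keyed by (model, stage)."""
    totals: Dict[Tuple[str, str], Decimal] = defaultdict(Decimal)
    for r in records:
        price_in, price_out = PRICE_PER_1K_INR[r.model]
        cost = (price_in * r.input_tokens + price_out * r.output_tokens) / 1000
        totals[(r.model, r.stage)] += cost
    return dict(totals)


records = [
    UsageRecord("model-a", "ClassificationAgent", 2000, 100),
    UsageRecord("model-a", "ParallelExtractionAgent", 8000, 2000),
    UsageRecord("model-b", "SummarizerAgent", 4000, 500),
]
breakdown = cost_breakdown(records)
job_total = sum(breakdown.values(), Decimal("0"))
```

`Decimal` is used instead of `float` so that small per-call amounts add up without binary rounding drift.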

Product Admin

Configures schema templates for new document types
Manages cache warming for frequent AMC layouts
Monitors extraction accuracy trends per document class
Adjusts confidence thresholds by tenant
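
Confidence gating with per-tenant thresholds can be sketched as follows. The 0.85 default is the review cut-off quoted in the Operations Analyst journey; the `ExtractedField` type, the tenant identifiers, and the override table are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

DEFAULT_THRESHOLD = 0.85  # fields below this go to the review queue

# Hypothetical per-tenant overrides set by a Product Admin.
TENANT_THRESHOLDS: Dict[str, float] = {"tenant-strict": 0.95}


@dataclass
class ExtractedField:
    """Hypothetical extracted value with its confidence score."""
    name: str
    value: str
    confidence: float


def split_for_review(
    fields: List[ExtractedField], tenant_id: str
) -> Tuple[List[ExtractedField], List[ExtractedField]]:
    """Return (accepted, needs_review) using the tenant's threshold."""
    threshold = TENANT_THRESHOLDS.get(tenant_id, DEFAULT_THRESHOLD)
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


fields = [
    ExtractedField("isin", "INF000000000", 0.99),
    ExtractedField("scheme", "Example Equity Fund", 0.90),
    ExtractedField("units", "125.500", 0.70),
]
default_accepted, default_review = split_for_review(fields, "tenant-default")
strict_accepted, strict_review = split_for_review(fields, "tenant-strict")
```

With the default threshold only `units` is queued for review; the stricter tenant also sends `scheme` to the queue.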

Key Features

Feature	Detail
Multi-Entity Extraction	Detects and isolates multiple clients/accounts in a single document
Schema-Driven Output	L1 (Transactions), L2 (Holdings), L3 (Underlying), L1+L2 (AIF SoA)
Excel Export	Custom-formatted export with preview mode before download
Human Review	Confidence-gated review workflow with per-field accept/reject/edit
Token Cost Tracking	Per-document INR cost breakdown by model and stage
Async Queues	Light (≤10 pp), Heavy (11-100 pp), XL (>100 pp) with Celery + Redis
Webhook Callbacks	Notifies Zen chatbot and external systems on completion
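
The page-count tiers in the Async Queues row map naturally onto a small routing function. The thresholds below come from the table; the queue names `light`, `heavy`, and `xl` and the function name are hypothetical.

```python
def select_queue(page_count: int) -> str:
    """Pick a worker queue from the page count (tiers from the feature table)."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    if page_count <= 10:
        return "light"   # Light: up to 10 pages
    if page_count <= 100:
        return "heavy"   # Heavy: 11 to 100 pages
    return "xl"          # XL: more than 100 pages


routes = {pages: select_queue(pages) for pages in (1, 10, 11, 100, 101)}
```

In a Celery deployment the returned name could be passed as the `queue` argument of `apply_async`, assuming workers are started for queues with those names.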

API Surface

All endpoints are proxied through Studio Middleware at /api/v1/nexus/*.

Method	Endpoint	Purpose
`POST`	`/api/v1/upload/`	Sync upload with pre-classification
`POST`	`/api/v2/pipeline/process`	Async pipeline kickoff
`GET`	`/api/v2/status/process/{id}/progress`	Real-time stage progress
`GET`	`/api/v1/export/{job_id}/excel`	Excel download
`PUT`	`/api/v1/review/{review_id}`	Submit human review
`GET`	`/api/v1/token-usage/{doc_id}`	Cost and token audit
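
A client that follows stage progress by polling might look like the sketch below. Only the progress path is taken from the table. The response shape (a `status` field with the values `completed` and `failed`), the base URL, and the injected `fetch_json` callable are assumptions, since the actual payload is not documented here.

```python
import time
from typing import Callable, Dict

PROGRESS_PATH = "/api/v2/status/process/{id}/progress"  # from the API table


def progress_url(base_url: str, process_id: str) -> str:
    """Build the progress URL for one pipeline run."""
    return base_url.rstrip("/") + PROGRESS_PATH.replace("{id}", process_id)


def poll_until_done(
    fetch_json: Callable[[str], Dict],
    url: str,
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call fetch_json(url) until the assumed terminal status appears.

    fetch_json is any callable that performs the GET request and returns
    the decoded JSON body; the 'status' key is a hypothetical field name.
    """
    for _ in range(max_attempts):
        payload = fetch_json(url)
        if payload.get("status") in ("completed", "failed"):
            return payload
        sleep(interval_s)
    raise TimeoutError(f"no terminal status after {max_attempts} attempts")


url = progress_url("https://sentinel.example.com/", "abc123")
```

Passing the HTTP call and the sleep function in as parameters keeps the loop independent of any particular HTTP library and easy to exercise with a stub.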

Security, Compliance & Operations

Data Residency — All document binaries stored in tenant-scoped S3 buckets (ap-south-1)
Encryption — AES-256 at rest (S3 SSE), TLS 1.3 in transit
PII Handling — Aadhaar, PAN, IFSC, UPI identifiers masked before LLM calls via Orchestrator
Retention — Configurable per tenant; default 7 years for financial records
Audit — Every extraction job logged with full provenance (user, timestamp, model versions, confidence scores)
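
Masking identifiers before a prompt leaves the platform can be illustrated with regular expressions. This is a simplified sketch and not the Orchestrator's actual masking logic: the patterns follow the published formats of PAN (five letters, four digits, one letter), Aadhaar (twelve digits, often grouped in fours), and IFSC (four letters, a zero, six alphanumerics), while the UPI pattern and the replacement tokens are assumptions.

```python
import re
from typing import List, Tuple

# (replacement token, pattern) pairs, applied in order. Tokens are hypothetical.
PII_PATTERNS: List[Tuple[str, "re.Pattern[str]"]] = [
    ("[PAN]", re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")),
    ("[AADHAAR]", re.compile(r"\b[2-9][0-9]{3}[ -]?[0-9]{4}[ -]?[0-9]{4}\b")),
    ("[IFSC]", re.compile(r"\b[A-Z]{4}0[A-Z0-9]{6}\b")),
    # Loose UPI handle pattern; it also matches the start of email addresses.
    ("[UPI]", re.compile(r"\b[A-Za-z0-9._-]{2,}@[A-Za-z]{2,}\b")),
]


def mask_pii(text: str) -> str:
    """Replace every matched identifier with its placeholder token."""
    for token, pattern in PII_PATTERNS:
        text = pattern.sub(token, text)
    return text


sample = "PAN ABCDE1234F, Aadhaar 2345 6789 0123, IFSC HDFC0001234, UPI ravi.k@okhdfc"
masked = mask_pii(sample)
```

Pattern matching alone will miss handwritten or OCR-distorted identifiers, so a production masker would likely combine it with checksum validation and entity detection.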

Related Capabilities

Wealth Understanding — Document Q&A over extracted holdings
Portfolio Intelligence — Analytics computed from extracted data
Digital Advisor — Chat interface that triggers document uploads
